All readers agree that the topic is interesting and that the work contains some novel ideas. They also agree on several serious problems. One problem, mentioned by all three reviewers, concerns the writing and organization of the manuscript. This stylistic issue may interact with more substantive ones. The paper has several different foci, and while some interesting things are said about each, it is not clear that a real advance has been made on any front. For example, the speculation about a new form of neural coding is fascinating but mainly a conjecture at this point. Reviewer B offers many detailed comments, some of which call into question the uniqueness of the model and the specific role played by harmonic resonance. As Reviewer A points out, the theory does not lead to much in the way of testable predictions. The analysis of current models of interpolation points up limitations and important issues, but as Reviewer C notes, it is not clear how the proposed model predicts what is perceived. Finally, if the model can be tightened up to make better predictions, experimental tests would be desirable. Running simulations to postdict particular displays is at best an exercise preliminary to confirming a model by scientific experiment. Accordingly, I must decline the paper for publication in Perception & Psychophysics. I would echo, however, Reviewer A's comment that the speculative content and non-experimental style of the current manuscript might fit more comfortably in some other kind of journal. Alternatively, a theoretically clarified and experimentally confirmed version of your work might make an excellent new submission to this journal at a later time. I am enclosing copies of the reviews. I hope they are helpful in your continuing work. Good luck in further developing your intriguing ideas.
Maybe I'm not the right person to evaluate this MS: I found it irritating because of the many ad hoc assumptions and non sequiturs. The MS is of a purely theoretical (to my taste: speculative) nature. The theory isn't really firmly grounded in empirical fact and doesn't lead to any hard, testable predictions. Thus in my opinion it doesn't fit very well in this journal: there exist journals expressly aimed at contributions like this (I expect that speculation has a valid place in science).
At the moment it doesn't seem very useful to add specific remarks. If the MS is considered for acceptance by P & P, I would be willing to contribute a list of specific comments if the editor would appreciate such.
The manuscript elaborates on one of Grossberg's models of early vision, suggesting a new form of coding in the visual system, and attempts to account for some visual illusions. The basic problem tackled here is the integration of local image features into some global percepts (Gestalt). The author suggests a specific architecture for integrating local features, partly based on some physiological theories and partly on intuition. While the problem is extremely important and the approach is very attractive, the outcome is somewhat confusing, maybe because of the style of writing. I guess a more carefully written full-length paper has a better chance of making the theory clear. I would encourage the author to make the effort and to write an extended manuscript. Below I list some of my confusions while reading the manuscript.
On the `Harmonic Resonance' code:
The harmonic resonance code assumes a specific type of connectivity between orientation units to enable standing waves within the network of oriented cells. It represents different types of line intersections (Figure 5) by icons (standing waves) of similar shape. As such, I do not see much advantage in this coding, though it can serve for image enhancement very much like lateral inhibition between orientations.
The author does not solve the problem of oscillations in a ring of connected cells (what would then happen to phase, i.e. to absolute orientation?) but rather assumes some filtering operation (Equation 6). Thus his hypothesis is equivalent to assuming excitatory and inhibitory interactions between orientation units. Indeed, he ends up with some local convolution in the orientation domain (Equation 8). It would be interesting to take a model of orientation cells, with all known (or hypothesized) interactions, and explore the parameter subspace which allows for standing waves to survive.
The role of harmonic resonance in the present simulation is not clear. Simulation results shown in Figure 6 present the response sum of all orientation units at every location, and over all orientation frequencies. Equation 10 represents summation over all filter frequencies in Equations 8 and 6 and thus the whole operation can be reduced to a convolution with a single filter which is the sum of all harmonics. (I am afraid we are getting back to pure orientation representation).
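The reduction described above can be checked numerically. The following is a minimal sketch, assuming that the filtering in Equations 6 and 8 is linear and acts as a circular convolution over a ring of orientation units. The cosine kernels, the number of units, and all variable names are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def circular_convolve(signal, kernel):
    """Circular convolution over the orientation axis, computed via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)))

# Hypothetical setup: 12 orientation units arranged on a ring.
n_orient = 12
theta = 2 * np.pi * np.arange(n_orient) / n_orient
rng = np.random.default_rng(0)
response = rng.random(n_orient)  # arbitrary oriented-unit activity at one location

# Assumed harmonic filters: one cosine kernel per harmonic number.
harmonics = [1, 2, 3, 4]
kernels = [np.cos(k * theta) for k in harmonics]

# Route 1: filter with each harmonic separately, then sum the outputs.
sum_of_outputs = sum(circular_convolve(response, k) for k in kernels)

# Route 2: sum the kernels first, then filter once.
combined_kernel = np.sum(kernels, axis=0)
output_of_sum = circular_convolve(response, combined_kernel)

equivalent = np.allclose(sum_of_outputs, output_of_sum)
print(equivalent)  # True
```

The equivalence holds only because convolution is linear. If a nonlinearity such as rectification or thresholding were applied to each harmonic's output before the summation, the two routes would no longer agree, and the separate harmonics would play a distinct role.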
The speculations concerning the hardware are premature. The statements about gap junctions are distracting and confusing and it is not clear why they are needed in the present context. For synchronous spike oscillations a one msec temporal resolution is sufficient (gap junctions are used in the auditory system where much better resolution is needed).
Technicalities: The mathematical formulation presented in the Appendix is somewhat confusing. In Equation 1, the absolute value should be on the total sum and not on the individual items, unless some special nonlinear convolution is assumed. Also in Equation 1 the q subscript of O has no corresponding parameter on the right side, though p seems to be a good candidate. In general, it is preferred to keep subscripts for function parameters and avoid indexing functions by variables. I assume that a Gabor filter is a function of spatial coordinates x and y, and thus should be written as F(x,y), or if the author wishes as F(i,j). The orientation parameter, as well as other Gabor parameters, can be attached to F as indices. It is also confusing to use i as a Gabor variable, as it can be confused with the i used in complex numbers [see the exp() in Equation 2]. I understand these conventions fit the discrete model computer implementation (and are used by the Grossberg school extensively), but the outcome is somewhat confusing. Also, please add a glossary with all functions and variables described.
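To illustrate why the placement of the absolute value in Equation 1 matters, here is a minimal sketch with made-up numbers. The patch, the filter weights, and the variable names are hypothetical and do not reproduce the manuscript's Equation 1.

```python
import numpy as np

# Hypothetical 1-D example: a uniform image patch and an odd-symmetric
# (edge-detecting) filter whose weights sum to zero.
image_patch = np.array([1.0, 1.0, 1.0, 1.0])
filter_weights = np.array([-1.0, -1.0, 1.0, 1.0])

products = image_patch * filter_weights

# Absolute value of the total sum: an ordinary linear filter response
# followed by rectification. A uniform patch gives no edge response.
abs_of_sum = abs(products.sum())      # 0.0

# Sum of the absolute values of the individual items: the signs of the
# filter weights are discarded, so the uniform patch gives a large response.
sum_of_abs = np.abs(products).sum()   # 4.0

print(abs_of_sum, sum_of_abs)
```

With the absolute value on the individual items, the filter responds to a uniform patch as strongly as to an edge, which is presumably not what is intended for an oriented edge detector.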
Criteria for successful modeling: I would prefer to see some quantitative predictions for experimental data and not only qualitative demonstrations. The problem addressed by the author is of major importance and receives increasing attention in the literature. Some papers [references l & ll] present detailed data, challenging all modelers. I list below some additional references.
Bonhoeffer & Grinvald (1993) The layout of iso-orientation domain in cat area 18: Optical imaging reveals pinwheel-like architecture. Journal of Neuroscience 13, 4157-4180.
Field, Hayes & Hess (1993) Good continuation and the association field: evidence for local feature integration in the visual system. Vision Research 33, 173-193.
Kovacs & Julesz (1993) A closed curve is much more than an incomplete one: Effect of closure in figure-ground segmentation. Proceedings of the National Academy of Sciences, USA 90, 7495-7497.
Lesher & Mingolla (1993) The role of edges and line-ends in illusory contour formation. Vision Research 33, 2253-2270.
Polat & Sagi (1993) Lateral interactions between spatial channels: Suppression and facilitation revealed by lateral masking experiments. Vision Research 33, 993-999.
Polat & Sagi (1994) Spatial interactions in human vision: from near to far via experience dependent cascades of connections. Proceedings of the National Academy of Sciences, USA 91, 1206-1209.
Lehar describes a modification of Grossberg's BCS approach to illusory figures. This revised model incorporates the potential for harmonic interactions. Such interactions, the author suggests, would allow the visual system to interpolate both smooth contours and vertices (intersections of two or more edges). The author applies this idea to some illusory figure displays and some dot grouping displays, and reports the results of a computer simulation.
The idea of harmonic interactions among groups of cooperative cells is intriguing, and explanations of visual phenomena in terms of dynamic interactions are certainly topical. Furthermore, as current theoretical accounts of the perception of illusory corners are lacking, any contribution to explaining such phenomena would be welcome. However, this manuscript is simply not ready for publication. There are three significant problems.
1) The manuscript has several conceptual weaknesses. Perhaps the most problematic is the author's treatment of the problem of mapping the simulated output onto actual percepts. This problem is not trivial. The author considers the model successful when some of the activity in the simulated output matches a percept. But the model seems to suffer from an embarrassment of riches. The model does construct "illusory intersections" but it also constructs many more "edges" than are seen in any given display. The model is missing some mechanism that selects the subset of "edges" that are seen. Simply applying a threshold would not be sufficient, as this would not capture the self-organizing stability of human percepts (e.g. in Figure 6D, why is a circle seen some of the time, and a square seen at other times, AND the two are never seen simultaneously? [also see Bradley's chapter in the Petry & Meyer book for other examples of multi-stable illusory figures]).
The attempt to address both illusory forms and dot grouping, while laudable, is also problematic in the absence of a description of how to go from the simulation output to the percept. Why should these two phenomena differ in phenomenal appearance if the same unit formation mechanism is active in both cases?
2) The writing and organization of the manuscript are also weak. Overall, the manuscript reads like parts of two papers: one paper about a directed diffusion model and its problems, and a second paper on a harmonic model. The result is a disjointed paper with insufficient introduction to a resonance-based approach to perceptual organization.
The discussion and review of the relevant literature are also insufficient. The author mischaracterizes the Gestalt psychologists. I suspect that the author would find the original Gestalt texts (e.g. Koffka, Köhler, & Wertheimer) and their ideas on dynamics and electrical fields intriguing, given his interest in applying the recent work on gap junctions to unit formation. Additionally, if the author decides to retain the discussion of spatial effects on illusory contour displays (which is used to introduce the directed diffusion model), then there should be references to Lesher & Mingolla's 1993 Vision Research paper, as well as Shipley & Kellman's 1990 Perception & Psychophysics paper.
The logical arguments also need to be tightened up. Some examples: it is not always clear when the author is using "acoustical resonance" as an analogy and when it should be treated as formally identical to the system described; on pp. 9-11, models provide evidence for visual processes, theories are supported by neurophysiological findings that appear agnostic with respect to the central ideas of the theory, and simplicity lends "credence and veracity".
3) Finally, the manuscript deviates considerably from the APA style guidelines. The author should read the "Publication Manual of the APA".