Visual Tracking Using Neural Networks
-- Tennis --
Find the truth, and act on it!
--------------------------------------------------------------------------------
----------------------------------------------
This is a study of visual tracking -- or the ability to "keep your eye on the ball." It's a follow-up to an earlier study. The purpose is to learn:
This read/react function is a difficult parallel processing task for the player and there has to be a lot going on in the brain to accomplish it. Considering that the brain has many circuits, it is reasonable to expect a difficult distributed parallel programming job. I try to apply neural network techniques to get closer to what's happening.
-----------------------------------------------
I assume that all experience is patterned. This relates to the player's perception of his tennis environment and information derived from the ball trajectory. That is, the experience is synthetic, or gestalt-like -- put together out of pre-empirical materials (the energy I presume impinges on our senses) and mental models (the way we organize the stuff that impinges on our senses). If this smacks of Immanuel Kant, don't hold him responsible for anything I say.
I also assume we are capable of making distinctions -- patterns against patterns and patterns within patterns. (If this reminds you of G. Spencer Brown, he's not to be held responsible for what I say, either!) In perception we can -- because we do -- isolate the tennis arena from the broader background, and we can distinguish players and spectators, the ball and racket, and all of the other paraphernalia. We also detect and recognize the trajectory of the tennis ball struck by a player. All have to be patterned phenomena -- else how could we identify them?
Also, I presume the existence of a tennis arena within which the tracking occurs. This is a stable base that provides the orientation or reference for tracking and interception. The ball itself is seen as part of the arena, or embedded in it, and it is this presumption that gives meaning to, and directs the selection of, trajectories for input to our network.
The distinction between a rational (or deductive) and empirical (or inductive) procedure fairly well describes the difference between my initial study of the tracking problem and this follow-up. The original study, like this one, presumed the existence of an arena within which the tracking occurs, but it also depended on using physical principles to define the trajectories and assumed specific trajectory properties to be the athlete's source of information in the tracking function. I also used the step-by-step approach of traditional digital computer programming to reach conclusions from my simulation of the process. So my approach was strongly deductive.
In this follow-up, my approach is more bottom-up, or inductive. This is to say that I assume a neural network model that will try to generate appropriate information patterns from a large sample of somehow -- possibly randomly -- selected trajectories. Network learning will take the system from lower to higher concepts.
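As a rough illustration of what a "large sample of randomly selected trajectories" might look like as network input, here is a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the study: the drag-free, spin-free flight model, the ranges for launch speed, angle, and contact height, and the names random_trajectory and sample_trajectories.

```python
import math
import random

G = 9.81  # gravitational acceleration, m/s^2

def random_trajectory(rng, n_points=10, dt=0.05):
    """Return one sampled ball path as a list of (t, x, y) tuples.

    Launch values are drawn at random; the ranges are illustrative guesses.
    """
    speed = rng.uniform(15.0, 35.0)               # launch speed in m/s (assumed range)
    angle = math.radians(rng.uniform(2.0, 20.0))  # launch angle above horizontal
    height = rng.uniform(0.5, 2.5)                # contact height in m (assumed range)
    vx, vy = speed * math.cos(angle), speed * math.sin(angle)
    path = []
    for i in range(n_points):
        t = i * dt
        # Drag-free projectile motion: x grows linearly, y follows a parabola.
        path.append((t, vx * t, height + vy * t - 0.5 * G * t * t))
    return path

def sample_trajectories(count, seed=0):
    """Draw `count` random trajectories; a fixed seed makes the sample repeatable."""
    rng = random.Random(seed)
    return [random_trajectory(rng) for _ in range(count)]

trajectories = sample_trajectories(1000)
print(len(trajectories), "trajectories of", len(trajectories[0]), "points each")
```

Each trajectory here is only a short list of timed positions. Whether a network would be given positions, velocities, or something closer to a retinal image is left open.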
As with all networks, it's also necessary to design a transformation function that relates outputs to inputs. I presume the network will try to produce general principles comparable to those I used in my initial study describing the motion of the trajectory, though again I have to say I'm struggling here to find words to express something I don't yet grasp. The sense is that of reaching tighter, more discriminating rational principles.
If I were to adopt a supervised training approach for the network, I imagine I'd use a formally described trajectory as the standard -- i.e., as the desired network output -- and adjust the weights of the network interconnections to produce a match. The training would use an error function, and a training session would count as complete once that error had been reduced to a minimum near zero.
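The supervised scheme just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the network of the study: the "formally described trajectory" is taken to be the height of a drag-free ball, the "network" is a single linear unit with three weights fed the inputs 1, t, and t squared, the error function is the mean squared error, and the weights are adjusted by plain gradient descent. The constants Y0 and V0, the learning rate, and the epoch count are all hypothetical.

```python
G = 9.81           # gravitational acceleration, m/s^2
Y0, V0 = 1.0, 5.0  # assumed launch height (m) and upward speed (m/s)

def standard_trajectory(t):
    """Desired output: formally described ball height at time t."""
    return Y0 + V0 * t - 0.5 * G * t * t

def features(t):
    """Inputs presented to the single linear unit."""
    return (1.0, t, t * t)

def predict(weights, t):
    """Network output: weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, features(t)))

def mean_squared_error(weights, times):
    """Error function comparing network output with the standard."""
    total = sum((predict(weights, t) - standard_trajectory(t)) ** 2 for t in times)
    return total / len(times)

def train(times, learning_rate=0.5, epochs=5000):
    """Adjust the weights by gradient descent on the mean squared error."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grads = [0.0, 0.0, 0.0]
        for t in times:
            x = features(t)
            err = predict(weights, t) - standard_trajectory(t)
            for i in range(3):
                grads[i] += 2.0 * err * x[i] / len(times)
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return weights

times = [i / 20 for i in range(21)]  # one second of flight, sampled every 0.05 s
initial_error = mean_squared_error([0.0, 0.0, 0.0], times)
weights = train(times)
final_error = mean_squared_error(weights, times)
print(f"error before training: {initial_error:.4f}, after: {final_error:.2e}")
```

In this toy case the error can be driven essentially to zero because the standard is exactly a weighted sum of the inputs. With noisy or more varied trajectories the minimum would sit above zero.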
The aim of the training itself is to raise the player's level of competence in the track and intercept function. The training affects both the observation and the motor component of the skill and thus changes the player's awareness of the environment -- learning changes the environmental pattern, in effect creating a new environment. Whether the change is for the good, or not, though, depends on the quality of the training.
----------------------------------------------
In what amounts to learning to perceive and act in the tennis environment, the network has to find a way to track and intercept the oncoming ball. This presumes the network can "detect" and "recognize" a tennis ball, that the ball is in the network's "experience," hence already patterned.
This is a huge assumption, but no more than what is presumed when training a real individual to track and intercept a ball. The trainee's ability to detect and recognize the ball is presumed, probably because we don't know how to do the training separately or test the results. The objective is to read the motion of the ball and make the intercept.
What should be apparent is that the ball is tracked in the context of the court, not in isolation. If this isn't obvious, let me point out that we don't know how the ball is tracked, that it isn't clear what it means precisely to "keep your eye on the ball." It's likely that the tennis background is a critical part of the process, because the arena itself provides the location and orientation references for the player's sensory-motor processes.
The importance of the tennis arena shouldn't be overlooked. Without some concept of the game, it wouldn't make sense to direct attention to the "tennis ball," the "tennis net," the "tennis racket," the "tennis players," or to any of the rest of the tennis stuff. Each presumes the game and the arena. We learn skills and guide our actions by means of the environment. We learn to read (observe with understanding) and we learn to respond in appropriate ways to what we observe. Visual tracking, in the tennis arena, is an important component of the observation. And learning to read the arena better improves its quality as a reference.
On the one hand, the perceived tennis court acts as a reference for the path of the ball. And on the other, the trainee's perception changes through training. So there is an interaction between perceiving the court environment and performing the track and intercept skill.
It should also be clear that tracking the ball and moving to intercept it are interactive and overlapping functions and should be treated that way in the network operations. The network, being a pattern recognition device at heart, has to consider spatial as well as temporal (static as well as dynamic) patterns.
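One common way to present a temporal pattern to a network with a fixed number of inputs is a sliding window, which turns a stretch of motion into a single static pattern. The Python sketch below assumes this encoding purely for illustration. The function name windows, the window size, and the sample positions (measured against a court-fixed origin) are hypothetical.

```python
def windows(positions, size):
    """Pair each run of `size` consecutive (x, y) positions with the one that follows.

    The run is flattened into one input vector, so a stretch of motion in time
    is presented to the network as a single static pattern.
    """
    pairs = []
    for start in range(len(positions) - size):
        window = positions[start:start + size]
        flat = [coord for point in window for coord in point]
        pairs.append((flat, positions[start + size]))
    return pairs

# Made-up ball positions in metres, relative to a fixed point on the court.
track = [(0.0, 1.0), (1.0, 1.4), (2.0, 1.6), (3.0, 1.6), (4.0, 1.4)]
pairs = windows(track, 3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Because the positions are expressed relative to the court, each input vector carries the spatial reference along with the motion.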
------------------------------------------------------