
The Fundamental Problem of Spoken Mathematics

Background

Currently, secondary-school students who are print disabled are in large part unable to obtain accessible instructional materials in mathematics and science. The natural result is a general deficit in math and science knowledge and skills among these students, who often lack the opportunity to study these subjects at all. Without access to science and math instructional materials, people who are print disabled are ultimately unable to seek employment in the growing high-tech job sector. If print-disabled individuals have an equal opportunity to study mathematics and science, many career opportunities that were previously not viable will open up to them.

General Description of MathSpeak™

Access to Math and Science information is a real problem for students with print disabilities (disabilities that prevent them from normal reading of the printed page). Students with print disabilities have a very hard time understanding the complex math equations that typically occur in Math and Science textbooks by just listening to someone read the math to them. This is mainly because of the lack of a standard for spoken mathematics, and also the traditional problems associated with reliance on a human assistant. This is a problem that can affect the ability of students from grade school through graduate school to learn. SeeWriteHear (SWH) MathSpeak™ technology solves this problem by combining a solid standard for spoken mathematics with high-quality computer synthesized speech. This allows the student to work by themselves at their own pace and retain ownership of the ideas learned. The two facets of the MathSpeak™ solution are the standard itself, and the computer synthesis for the production of audio renderings.

The MathSpeak™ Standard

The MathSpeak™ standard itself is very powerful since it is based on the fundamental principles of the Nemeth Braille Code for Mathematics and Sciences, the current standard for encoding mathematics into Braille. This code, developed by Dr. Abraham Nemeth, a SWH employee at the time of development, allows superior access to mathematics by conveying the information unambiguously and concisely using a special grammar and lexicon unique to mathematics. Dr. Nemeth has mapped the advantages of the Braille code over into a special spoken language for mathematics called MathSpeak. It is this language that SWH developed from theory to practice.

The power of the MathSpeak standard can best be understood by a simple example of the root problem. Consider the following simple mathematical equation as it would likely be read by a human reader:

x equals a over B plus 1

When visualizing this equation, there are actually two possible meanings (or visual renderings) for this one voicing, as shown here:

Two possible visual renderings of x equals a over B plus 1.
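
For readers without access to the figure, the two readings of this voicing can be written out explicitly. This is a reconstruction from the spoken sentence, not a copy of the original image:

```latex
x = \frac{a}{B} + 1
\qquad \text{or} \qquad
x = \frac{a}{B + 1}
```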

Which is the correct version? For a print-disabled student taking a test, the answer is crucial. Unfortunately, current techniques for the human production of audio for math are rife with these kinds of ambiguities, in addition to being of inconsistent quality, expensive, and time-consuming to make. The everyday reality for print-disabled math and science students is that most materials are not available in an alternative format, so human assistants must be constantly employed, which drains time and money from both the student and the school.

MathSpeak offers a precise, perfectly consistent version of the above equation each and every time the student listens to it:

x equals BEGIN FRACTION a OVER CAPITAL b END FRACTION plus 1.

The words in ALL CAPS are special reserved words in MathSpeak that are used to indicate to the listener what the actual semantic meaning of the equation is meant to be. The above MathSpeak snippet can be interpreted (or visually rendered) in only one, unambiguous way:

Visual rendering of x equals BEGIN FRACTION a OVER CAPITAL b END FRACTION plus 1.

Note that both the proper contents of the fraction and the fact that the denominator is a capital (as opposed to lowercase) variable are indicated by the use of MathSpeak. This is but one of the many advantages to the use of an automatically generated, systematic standard.
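
The effect of the reserved words can be illustrated with a small sketch. It is not SWH's implementation: the node classes (`Var`, `Num`, `Add`, `Frac`) and the functions `speak_naive` and `speak_delimited` are hypothetical names chosen for this example, and only the fraction and capital-letter words quoted above are modeled. The sketch shows that two different expression trees collapse to the same words when read without delimiters, but stay distinct once the fraction boundaries are spoken.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Frac:
    numerator: "Expr"
    denominator: "Expr"


Expr = Union[Var, Num, Add, Frac]


def speak_naive(expr: Expr) -> str:
    """Read an expression left to right with no grouping words."""
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, Num):
        return str(expr.value)
    if isinstance(expr, Add):
        return f"{speak_naive(expr.left)} plus {speak_naive(expr.right)}"
    if isinstance(expr, Frac):
        return f"{speak_naive(expr.numerator)} over {speak_naive(expr.denominator)}"
    raise TypeError(f"unsupported node: {expr!r}")


def speak_delimited(expr: Expr) -> str:
    """Read an expression with explicit fraction and capital-letter markers."""
    if isinstance(expr, Var):
        # Capital letters are announced so that "B" and "b" cannot be confused.
        if expr.name.isupper():
            return f"CAPITAL {expr.name.lower()}"
        return expr.name
    if isinstance(expr, Num):
        return str(expr.value)
    if isinstance(expr, Add):
        return f"{speak_delimited(expr.left)} plus {speak_delimited(expr.right)}"
    if isinstance(expr, Frac):
        top = speak_delimited(expr.numerator)
        bottom = speak_delimited(expr.denominator)
        return f"BEGIN FRACTION {top} OVER {bottom} END FRACTION"
    raise TypeError(f"unsupported node: {expr!r}")


# The two readings of "a over B plus 1" (right-hand side of the equation only).
plus_outside = Add(Frac(Var("a"), Var("B")), Num(1))
plus_inside = Frac(Var("a"), Add(Var("B"), Num(1)))

print(speak_naive(plus_outside))      # same words as the line below
print(speak_naive(plus_inside))
print(speak_delimited(plus_outside))  # fraction ends before "plus 1"
print(speak_delimited(plus_inside))   # fraction ends after "plus 1"
```

Because the closing word END FRACTION is always spoken, the listener knows exactly where the denominator stops, which is the information the naive reading throws away.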

Another example of the power of MathSpeak comes from the fact that the grammatical system that it uses provides immediate feedback as to the current location of the listener in a complex equation. This means that a listener can actually follow along as a long string of math is read without getting “lost.”  A simple example helps to explain the problem. Consider the following equation:

Visual rendering of spoken equation y equals x SUBSCRIPT j SUPERSCRIPT 2e SUPER-SUPERSCRIPT minus i SUPER-SUPER-SUBSCRIPT n SUPER-SUPERSCRIPT pi BASE.
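
Since the image itself is not reproduced here, the equation can be reconstructed by reading the level words in the caption literally. The following LaTeX is that reconstruction, offered as a best reading of the caption, not as a copy of the original figure:

```latex
y = x_{j}^{2e^{-i_{n}\pi}}
```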

In MathSpeak, this would be spoken as follows:

y equals x SUBSCRIPT j SUPERSCRIPT 2e SUPER-SUPERSCRIPT minus i SUPER-SUPER-SUBSCRIPT n SUPER-SUPERSCRIPT pi BASE.

Although this equation is complex and difficult to listen to regardless of the circumstances, MathSpeak represents the best available method of conveying the information at hand. During any part of the equation, the listener can deduce exactly which level of superscript or subscript they are currently hearing without having to wait for more context cues. Hence, the subscript of “n” for the variable “i” in the second-level superscript can be properly identified as SUPER-SUPER-SUBSCRIPT, or “go up, up again, and then down.”
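
The way each level word encodes the full path from the baseline can be sketched in a few lines. This is an illustration inferred from the single example above, not the published MathSpeak grammar: the names `level_token` and `speak_levels`, the "up"/"down" path encoding, and the rule of closing with BASE when the expression ends off the baseline are all assumptions made for this sketch.

```python
from typing import Iterable, Tuple

Path = Tuple[str, ...]  # each step is "up" (superscript) or "down" (subscript)


def level_token(path: Path) -> str:
    """Name a script level by the full path that leads to it from the baseline."""
    if not path:
        return "BASE"
    for step in path:
        if step not in ("up", "down"):
            raise ValueError(f"unknown step: {step!r}")
    # Every step except the last is abbreviated; the last step is spelled out.
    prefixes = ["SUPER" if step == "up" else "SUB" for step in path[:-1]]
    last = "SUPERSCRIPT" if path[-1] == "up" else "SUBSCRIPT"
    return "-".join(prefixes + [last])


def speak_levels(segments: Iterable[Tuple[Path, str]]) -> str:
    """Speak (path, text) segments, announcing the level whenever it changes."""
    words = []
    current: Path = ()
    for path, text in segments:
        if path != current:
            words.append(level_token(path))
            current = path
        words.append(text)
    # Assumption: return to the baseline explicitly, as the example above does.
    if current:
        words.append("BASE")
    return " ".join(words)


example = [
    ((), "y equals x"),
    (("down",), "j"),
    (("up",), "2e"),
    (("up", "up"), "minus i"),
    (("up", "up", "down"), "n"),
    (("up", "up"), "pi"),
]

print(speak_levels(example))
```

Each announced word is an absolute position, not a relative move, so a listener who joins in the middle of the expression can still tell which level is being read.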

All of this work must be encompassed in an XML framework in order to allow automatic generation of the audio and in order to fit into SWH’s standard production processes.

The MathSpeak Variables

MathSpeak currently has four defined variables that determine how a mathematical expression will be rendered. They are:

  • Verbosity
  • Explicitness
  • Semantic Interpretation
  • Language

The Verbosity variable currently has two settings, “verbose” and “brief.” This variable affects only the lexicon used, with the brief setting having terser pronunciations. For example, for the fraction “x / y,” verbose would say “begin-fraction x over y end-fraction,” while brief would say “b-frac x over y end-frac.” For uncommon symbols, there is often no difference between verbose and brief. The Verbosity variable was created so that “verbose” would be easier to learn initially, while “brief” allows a smooth transition to a more efficient pronunciation that makes complex equations easier to follow.
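
Because Verbosity changes only the lexicon, it can be modeled as a lookup table that leaves the structure of the utterance untouched. The sketch below is hypothetical: the `LEXICON` table, its keys, and the `speak_fraction` function are names invented for illustration, and only the fraction words quoted above are included.

```python
# Hypothetical lexicon: same grammatical slots, different spoken words.
LEXICON = {
    "verbose": {
        "begin_fraction": "begin-fraction",
        "over": "over",
        "end_fraction": "end-fraction",
    },
    "brief": {
        "begin_fraction": "b-frac",
        "over": "over",
        "end_fraction": "end-frac",
    },
}


def speak_fraction(numerator: str, denominator: str, verbosity: str = "verbose") -> str:
    """Speak a fraction; the word order is fixed, only the words vary."""
    if verbosity not in LEXICON:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    words = LEXICON[verbosity]
    return " ".join([
        words["begin_fraction"],
        numerator,
        words["over"],
        denominator,
        words["end_fraction"],
    ])


print(speak_fraction("x", "y", "verbose"))
print(speak_fraction("x", "y", "brief"))
```

Keeping the grammar separate from the word list is also what would let a listener switch settings without relearning how expressions are structured.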

MathSpeak sometimes relies on shortcut/exception rules that make math easier to say. This can make some expressions confusing at first. Therefore, the Explicitness variable was created so that expressions could be understood without knowing these exceptions. The default setting of the Explicitness variable is “off,” and currently the only other setting is “on.” As an example, with Explicitness off, the variable x with a subscript of 1 is spoken as “x 1.” If the listener finds this confusing, they can turn on Explicitness and get “x subscript 1.”
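
The Explicitness switch can be pictured as a flag that disables shortcut rules. In the minimal sketch below, the function name `speak_subscripted` is hypothetical, and the condition for the shortcut (a purely numeric subscript) is an assumption made for illustration; the text above gives only the single example of x with a subscript of 1.

```python
def speak_subscripted(base: str, subscript: str, explicit: bool = False) -> str:
    """Speak a base with one subscript, optionally using the numeric shortcut."""
    # Assumed shortcut rule: a purely numeric subscript may drop the word
    # "subscript" when Explicitness is off (the default).
    if not explicit and subscript.isdigit():
        return f"{base} {subscript}"
    return f"{base} subscript {subscript}"


print(speak_subscripted("x", "1"))                 # shortcut form
print(speak_subscripted("x", "1", explicit=True))  # fully spelled out
print(speak_subscripted("x", "j"))                 # no shortcut for a letter
```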

Semantic Interpretation (SI) is a variable that also takes the value “on” or “off.” Semantic Interpretation adds rules that allow math to be spoken more naturally and, when possible, conveys expressions by what they mean instead of what they look like. Take the example of numeric fractions such as “5/8.” Without SI, it would be spoken as “begin-fraction 5 over 8 end-fraction,” and with SI it would be spoken as “five-eighths.” As another example, take the expression “|x|.” Without SI, it would be spoken as “vertical-line x vertical-line.” With SI, it would be spoken as “the absolute value of x.”
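
The two SI examples can be sketched as rules that are consulted only when the flag is on, with the structural reading as the fallback. The function names, the word tables, and the restriction to single-digit numerators and denominators are assumptions made to keep the example short; they are not taken from the MathSpeak rule set.

```python
CARDINALS = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five",
             6: "six", 7: "seven", 8: "eight", 9: "nine"}
ORDINALS = {2: "half", 3: "third", 4: "fourth", 5: "fifth",
            6: "sixth", 7: "seventh", 8: "eighth", 9: "ninth"}


def speak_numeric_fraction(numerator: int, denominator: int, si: bool = False) -> str:
    """Speak a numeric fraction structurally, or by meaning when SI is on."""
    if si and numerator in CARDINALS and denominator in ORDINALS:
        part = ORDINALS[denominator]
        if numerator != 1:
            # Pluralize the denominator word: "half" -> "halves", "eighth" -> "eighths".
            part = "halves" if part == "half" else part + "s"
        return f"{CARDINALS[numerator]}-{part}"
    # Fallback: describe what the expression looks like.
    return f"begin-fraction {numerator} over {denominator} end-fraction"


def speak_vertical_bars(inner: str, si: bool = False) -> str:
    """Speak |inner| structurally, or as an absolute value when SI is on."""
    if si:
        return f"the absolute value of {inner}"
    return f"vertical-line {inner} vertical-line"


print(speak_numeric_fraction(5, 8))
print(speak_numeric_fraction(5, 8, si=True))
print(speak_vertical_bars("x"))
print(speak_vertical_bars("x", si=True))
```

When no semantic rule applies, the sketch falls back to the structural reading, so turning SI on never leaves an expression unspoken.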

The Language variable currently only allows English. There are plans for MathSpeak to be converted into other languages. The plan is for the different languages to differ only in the lexicon that is used, such that there is a one-to-one mapping between languages.