ScanSoft Dragon NaturallySpeaking 6
Mia Roop of ScanSoft and Bob Zwolinski of Dragon Systems kicked off the presentations with a demonstration of Dragon NaturallySpeaking 6 (DNS6), the latest version of a popular application that analyzes speech and converts it to editable text.
Voice recognition products have changed hands several times in the recent past: ScanSoft bought DNS from Lernout and Hauspie (L&H), which had earlier bought Dragon Systems. At the time, L&H also had Voice Express, RealSpeak, and DictaPhone. L&H went bankrupt a few months later amid financial troubles, including alleged accounting irregularities that prompted the arrest of the company's founders (sound familiar?).
ScanSoft has also acquired Caere's best-known products, the OCR application OmniPage and other scanning software and digital document management titles. ScanSoft plans to put speech recognition into automobiles, develop telephony-based products, and continue the Dragon series of software. Based in Peabody, Massachusetts, and publicly traded on NASDAQ, ScanSoft has strong vertical markets in legal, government, education, medical, and financial services areas.
ScanSoft bills itself as the global market leader in advanced speech and language solutions. It says it has brought together the best technologies from Voice Express and Dragon NaturallySpeaking to produce a superior, "best-of-breed" product: Dragon NaturallySpeaking version 6.
The number one reason to use a speech-to-text application is to increase productivity: you can talk a lot more quickly than you can type. Second, it reduces the risk of repetitive stress injury (RSI), such as carpal tunnel syndrome. Third, it's a "natural" interface; everyone knows how to talk. And fourth, it promotes standardization.
ScanSoft publishes several editions of DNS6: Preferred, Professional, and several custom vocabulary versions (medical, legal, etc.). The Professional edition includes "macro" capability and a 250,000-word vocabulary, not to mention a much higher price. Preferred has a suggested price of $199 (the upgrade from DNS5 is $149), while Professional is $695. Special vocabulary editions start at $995. Macro capability lets the user build elaborate sequences of text and commands that are triggered by a chosen keyword or phrase. Preferred lacks macros but still offers "dictation shortcuts," which insert whole paragraphs when DNS hears a key phrase.
Anything one can do on the computer can be controlled by voice: creating new documents, sending e-mail, automating repetitive tasks, and anything else one does on an ongoing basis. A 300-word document takes an average typist about six minutes to type. Spoken, the same document is created in about two minutes. That works out to roughly 150 words per minute, in line with the rate of about 160 words per minute that DNS6 can recognize at optimum performance on a modern computer.
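The arithmetic behind those figures can be checked with a few lines of Python. This is only an illustrative sketch: the function name is made up for this example, and the inputs are the 300-word, six-minute, and two-minute figures quoted in the presentation.

```python
def words_per_minute(words: int, minutes: float) -> float:
    """Return the average rate in words per minute."""
    return words / minutes


# Figures quoted in the presentation: a 300-word document,
# about six minutes typed and about two minutes dictated.
typing_wpm = words_per_minute(300, 6)
dictation_wpm = words_per_minute(300, 2)

# How many times faster dictation is than typing.
speedup = dictation_wpm / typing_wpm

print(f"Typing:    {typing_wpm:.0f} words per minute")
print(f"Dictation: {dictation_wpm:.0f} words per minute")
print(f"Speedup:   {speedup:.0f}x")
```

By these numbers, dictation is about three times as fast as typing for an average typist.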
DNS6 is easy to use. With past versions, training DNS took an initial 30 minutes, and the program continued learning as time went on. Eventually, weeks later, DNS would reach an accuracy level that was tolerable but by no means respectable. According to ScanSoft, DNS6 requires a training period of 5 minutes to reach 90% accuracy. Misrecognitions are easily fixed, and words not in the vocabulary are easily added. Vocabulary can also be added by having DNS6 scan existing documents, including word lists.
The suggested minimum hardware is a Pentium II class CPU and 128 MB of RAM. But, as always, faster, beefier computers will deliver faster, more accurate recognition. An echo-canceling microphone/headset comes in the box. For those who are building their own computers, ScanSoft also recommends a "SoundBlaster Live"-class sound card.
ScanSoft offered attendees of tonight's meeting and members of the SPCUG a price of $99 for the Preferred edition.
Bob then gave a live demonstration of DNS6 and answered questions. DNS6 was impressively accurate, though it still produced a few misrecognitions. DNS6 adds a new technology to the DNS line called "nothing-but-speech." This technology screens out non-speech articulations, such as "uh…," "umm…," "er…," and most instances of stuttering and false starts.
He also demonstrated DNS6's "playback" function. DNS records each utterance and saves it to disk as a temporary file, which can then be saved permanently as a RealAudio file. If what DNS entered is not even close to what the user believes was said, playing back the recording will often reveal a "speech-o," akin to a "typo."
Bob mentioned that DNS6 Professional can recognize English spoken with certain foreign accents: British (UK), Australian, and Southeast Asian. Generally speaking, any accent will require a longer training period, and initial accuracy will be about 87%. He also mentioned that one can't use DNS to cheat at transcribing from tape (or, at least, not very effectively), because DNS requires the speaker to verbalize punctuation.
Yes, he said, Windows XP comes with voice recognition, which is about as powerful, compared to DNS, as Paint is to a drawing program and WordPad is to a word processor.
Alpha Software Alpha 5
Ray Difazzio, representing Alpha Software, came up from San Francisco to show the product he uses to develop database applications for his clientele. He is an advocate for Alpha 5, which he believes is the best-performing personal productivity and small business application.
Ray demonstrated some typical user activities on version 4 (version 5 is due out in the third quarter). Right out of the box, Alpha 5 includes templates: small, ready-made applications that the user can modify as desired rather than building from scratch. The applications include Invoice, Contact Manager, and dozens more. Alpha 5 is so versatile that applications are limited only by one's imagination.
Alpha 5 is a fully relational database management system. Relational means that for each record in one recordset, a relation exists between it and many other records of another recordset: a one-to-many relation. For example, recordset A may contain information on doctors. Recordset B may contain information on patients. For each doctor, a relation exists between that doctor and any number of patients. Alpha 5 presents these relationships in a graphical control panel.
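The doctor and patient example can be sketched with Python's built-in sqlite3 module. This is a generic illustration of a one-to-many relation, not Alpha 5's own storage format or syntax; the table names, column names, and sample records are all hypothetical.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")

# One doctor can have many patients: each patient row
# carries the id of the doctor it belongs to.
conn.executescript("""
CREATE TABLE doctors (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE patients (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    doctor_id INTEGER NOT NULL REFERENCES doctors(id)
);
""")

# Hypothetical sample records.
conn.executemany(
    "INSERT INTO doctors (id, name) VALUES (?, ?)",
    [(1, "Dr. Adams"), (2, "Dr. Baker")],
)
conn.executemany(
    "INSERT INTO patients (name, doctor_id) VALUES (?, ?)",
    [("Pat One", 1), ("Pat Two", 1), ("Pat Three", 2)],
)


def patients_of(doctor_name: str) -> list[str]:
    """Follow the relation from one doctor to all of that doctor's patients."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM patients AS p
        JOIN doctors AS d ON d.id = p.doctor_id
        WHERE d.name = ?
        ORDER BY p.id
        """,
        (doctor_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(patients_of("Dr. Adams"))
print(patients_of("Dr. Baker"))
```

The link lives in the patients table's doctor_id column, so adding a patient never requires changing the doctor's record.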
In his presentation, Ray built a new database (mainly a collection of tables, though it includes other things as well) based on the included zip code table. Within a table, field rules let one enter calculations that produce the kind of value a field expects, a date for example. For maximum productivity, users should view and edit records through forms, or user-created control panels, which are easy to create.
Alpha 5 has two ways of programming an application: action scripting or a full-featured language called XBasic. Ray quickly applied two commands to a button on the form he was working with. When the button was clicked, it executed those two commands: displaying a custom message in a message box, and automatically advancing and displaying the contents of the next record on the form. These were simple tasks, but in as little as two hours, highly complex calculations, look-ups, and modifications can be programmed.
Version 5 of Alpha 5 is in beta and is due to be released shortly. Additional features and capabilities include report generation straight to PDF and a built-in e-mail client for sending reports straight out over the Internet. Later minor versions will make Alpha 5 completely Web-enabled, permitting users to log into the database via a Web browser, and will add complete ActiveX Data Objects (ADO) integration. Almost universal foreign file format recognition and conversion is also planned.
An online message board exists for users to post questions and engage in general discussion of Alpha 5, and Alpha Software publishes a monthly newsletter.