adintool

Langue: en

Autres versions - même langue

Version: LOCAL (ubuntu - 07/07/09)

Section: 1 (Commandes utilisateur)

NAME

adintool - audio tool to record/split/send/receive speech data for Julius

SYNOPSIS

adintool -in inputdev -out outputdev [options...]

DESCRIPTION

adintool analyzes speech input, finds speech segments skipping silence, and records the detected segments in various ways. It performs speech detection based on zerocross number and power (level), and records the detected parts to files or other output devices sucessively.

adintool is a highly-functioned version of adinrec. The supported input device are: microphone input, a speech file, standard tty input, and network socket (called adin-net server mode). The speech segments are saved to output devices: speech files, standard tty output, and network socket (called adin-net client mode). For example, you can record the incoming speech segments to files with successively-numbered suffixes, or send them to speech recognition engine julius to recognize them.

The output is not buffered: the receiver can get speech data with only a slight delay after a speech starts. The speech detection algorithm is as same as that of adinrec.

Output format is WAV, 16bit (signed short), monoral. If the file already exist, it will be overridden.

INPUT

The input device should be specified by one of the following options:
-in mic
Microphone input (default)
-in file
Speech data file. Supported format is RAW (16bit big endian) and WAV (no compression) etc (depending on the compilation time setting).
The input file name should be given later: prompt will appear after startup.
-in adinnet
Make adintool "adinnet server", waiting for connection from adinnet client and receiving speech data from there via tcp/ip socket.
Default port number is 5530, which can be altered by option "-port".
-in netaudio
If supported, get input data from NetAudio/Datlink server. Host and unit name should be given with "-NA host:unit" option.
-in stdin
Read speech data from standard tty input. Only RAW and WAV format is supported.

OUTPUT

Specify one of these below to select an output device which the detected speech segments are going to written to.
-out file
Output to files. The base filename should be given by option like "-filename foobar". Actually, the detected segments are recorded in separate files such as "foobar.0000", "foobar.0001" and so on. The four-digit ID begin with 0. This initial value can be set explicitly by option "-startid". The output format is WAV, 16bit signed. This can be changed by "-raw" option.
-out adinnet
Make adintool "adinnet client", making connection to an adinnet server on a host, and send speech data to the server. The hostname should be specified by option "-server". The default port number is 5530, which can be altered by option "-port". The available adinnet server so far is adintool and Julius.
-out stdout
Output to standard tty output in RAW, 16bit signed (big endian).

OPTIONS


-server host[,host...]
Server(s) to connect with "-out adinnet". With multiple server, port number for each host should be specified by comma-separated list. (default: 5530)
-port num[,host...]
Port number to connect with "-out adinnet". Should be correspond with "-server"
-nosegment
Re-direct whole input speech data to output device, without speech detection and segmentation. With this option, the output filename does not have its four-digit ID appended.
-oneshot
Record only the first speech segment.
-freq threshold
Sampling frequency (Hz, default=16000)
-48
Record in 48kHz, and down sampling to 16kHz.
-lv threslevel
Level threshold (0-32767, default=2000)
-zc zerocrossnum
Zero cross number threshold in a second (default=60)
-headmargin msec
Header margin of each speech segment (unit: milliseconds) (default: 400)
-tailmargin msec
Tail margin of each speech segment (unit: milliseconds) (default: 400)
-nostrip
Disable skipping of invalid zero samples (default: enabled)
-zmean
Enable zero mean subtraction to remove DC offset.
-raw
Output in RAW (no header) 16bit, big engian format (default: WAV)
-autopause
Automatically pause at each input end.
-loosesync
When connecting to multiple servers, avoid strict synchronization for server-side pause and resume command.
-rewind msec
By default, adintool will ignore speech input while being paused by server-side command. This may be a problem if an input begins while paused and then adintool resumes before the input ends. This option will send the last msec inputs before resuming.

EXAMPLE

Record microphone input only for the speech-detected part in "data.0000.wav", "data.0001.wav", ...:


    % adintool -in mic -out file -filename data

Split a large speech data "foobar.raw" to "foobar.1500.wav", "foobar.1501.wav", etc:


    % adintool -in file -out file -filename foobar
      -startid 1500
      (enter the input filename after startup)
      enter filename->foobar.raw
      ....

Send whole speech file to other host via tcp/ip socket:


  [sender]
    % adintool -in adinnet -out file -nosegment
  [receiver]
    % adintool -in file -out adinnet -server hostname
      -nosegment

Send microphone input to Julius running on other host:

(1) Transmit whole input, and let Julius execute
    speech detection and recognition:


  [Julius]
    % julius -C xxx.jconf ... -input adinnet
  [adintool]
    % adintool -in mic -out adinnet -server hostname
      -nosegment

(2) Detect speech segment at input client side
    (adintool), and transmit only the detected parts
    to Julius, and recognize them:


  [Julius]
    % julius -C xxx.jconf ... -input adinnet
  [adintool]
    % adintool -in mic -out adinnet -server hostname

SEE ALSO

julius(1), adinrec(1) Copyright (c) 1991-2007 Kawahara Lab., Kyoto University
Copyright (c) 2001-2007 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005-2007 Julius project team, Nagoya Institute of Technology

AUTHORS

LEE Akinobu (Nagoya Institute of Technology, Japan)
contact: julius-info at lists.sourceforge.jp

LICENSE

Same as Julius.