
s3_align.h File Reference

Data structures for alignment.

#include <logmath.h>
#include <s3types.h>


Classes

struct  align_stseg_s
struct  align_phseg_s
struct  align_wdseg_s

Typedefs

typedef struct align_stseg_s align_stseg_t
typedef struct align_phseg_s align_phseg_t
typedef struct align_wdseg_s align_wdseg_t

Functions

int32 align_init (mdef_t *_mdef, tmat_t *_tmat, dict_t *_dict, cmd_ln_t *_config, logmath_t *_logmath)
void align_free (void)
int32 align_build_sent_hmm (char *transcript, int insert_sil)
int32 align_destroy_sent_hmm (void)
int32 align_start_utt (char *uttid)
void align_sen_active (uint8 *senlist, int32 n_sen)
int32 align_frame (int32 *senscr)
int32 align_end_utt (align_stseg_t **stseg, align_phseg_t **phseg, align_wdseg_t **wdseg)


Detailed Description

Data structures for alignment.


Typedef Documentation

typedef struct align_phseg_s align_phseg_t
 

Phone level segmentation/alignment information

typedef struct align_stseg_s align_stseg_t
 

State level segmentation/alignment; one entry per frame

typedef struct align_wdseg_s align_wdseg_t
 

Word level segmentation/alignment information
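
As a rough illustration of how a one-entry-per-frame list relates to segment-level lists, the sketch below collapses one label per frame into runs with a first and last frame. The seg_t type and the frames_to_segments function are hypothetical names introduced for this example; they do not reflect the actual fields of align_stseg_t, align_phseg_t or align_wdseg_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment record; NOT the layout of align_phseg_t. */
typedef struct {
    int id;   /* label shared by every frame in the segment */
    int sf;   /* first frame of the segment, inclusive */
    int ef;   /* last frame of the segment, inclusive */
} seg_t;

/* Collapse frame_id[0..n_frame-1] into runs of equal id.
 * seg must have room for n_frame entries (the worst case).
 * Returns the number of segments written. */
int frames_to_segments(const int *frame_id, int n_frame, seg_t *seg)
{
    int n_seg = 0;

    assert(frame_id != NULL && seg != NULL && n_frame >= 0);
    for (int f = 0; f < n_frame; f++) {
        if (n_seg > 0 && seg[n_seg - 1].id == frame_id[f]) {
            seg[n_seg - 1].ef = f;          /* extend the current run */
        } else {
            seg[n_seg].id = frame_id[f];    /* start a new run */
            seg[n_seg].sf = f;
            seg[n_seg].ef = f;
            n_seg++;
        }
    }
    return n_seg;
}
```

Note that this simplification merges two adjacent segments that carry the same label; an aligner that tracks HMM node identity would not have that limitation.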


Function Documentation

int32 align_build_sent_hmm (char *wordstr, int insert_sil)

Build a sentence HMM for the given transcription (wordstr). A two-level DAG is built: phone-level and state-level.

  • <s> and </s> are always added at the beginning and end of the sentence to form an augmented transcription.
  • Optional <sil> and noise words are added between words in the augmented transcription.

wordstr must contain only the transcript, with no extraneous material such as an utterance ID. The phone-level HMM structure has replicated nodes to allow for different left- and right-context CI phones; hence, each pnode corresponds to a unique triphone in the sentence HMM.

Returns 0 if successful, <0 on any error (e.g., an OOV word is encountered).
Parameters:
wordstr  In: Word transcript
insert_sil  In: Whether to insert silences/fillers

int32 align_destroy_sent_hmm (void)

int32 align_end_utt (align_stseg_t **stseg_out, align_phseg_t **phseg_out, align_wdseg_t **wdseg_out)

Call after all frames have been consumed. Traces back the best Viterbi state sequence and dumps it out.

Parameters:
stseg_out  Out: list of state segmentation
phseg_out  Out: list of phone segmentation
wdseg_out  Out: list of word segmentation
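
The traceback idea can be sketched generically as follows. This is not the library's implementation: the traceback function and the flat backpointer table layout are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: recover the best state per frame from a table of
 * backpointers. bp[f * n_state + s] holds the predecessor state chosen for
 * state s at frame f; row 0 is never read. path must have n_frame entries. */
void traceback(const int *bp, int n_frame, int n_state,
               int final_state, int *path)
{
    int s = final_state;

    assert(bp != NULL && path != NULL);
    assert(n_frame > 0 && n_state > 0);
    assert(final_state >= 0 && final_state < n_state);

    /* Walk from the last frame back to the first, following backpointers. */
    for (int f = n_frame - 1; f >= 0; f--) {
        path[f] = s;
        if (f > 0)
            s = bp[f * n_state + s];
    }
}
```

The resulting path has one state per frame, which matches the one-entry-per-frame shape of the state-level output.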

int32 align_frame (int32 *senscr)

One frame of Viterbi time alignment.

Parameters:
senscr  In: array of senone scores this frame
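
A single Viterbi frame update can be sketched on a simple left-to-right state chain, as below. The sketch rests on several assumptions: viterbi_step, WORST_SCORE and the array layout are hypothetical, scores are log-domain integers where higher is better, int32_t stands in for int32, and transition scores are omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORST_SCORE (-1000000000)   /* stands in for "unreachable" */

/* Hypothetical helper: advance a left-to-right state chain by one frame.
 *   prev[s]      best path score ending in state s at the previous frame
 *   cur[s]       (out) best path score ending in state s at this frame
 *   bp[s]        (out) predecessor chosen for s (s or s-1), -1 if unreachable
 *   state_sen[s] senone id attached to state s
 *   senscr[]     senone scores for this frame
 */
void viterbi_step(const int32_t *prev, int32_t *cur, int32_t *bp,
                  const int32_t *state_sen, const int32_t *senscr,
                  int n_state)
{
    assert(prev != NULL && cur != NULL && bp != NULL);
    assert(state_sen != NULL && senscr != NULL && n_state > 0);

    for (int s = 0; s < n_state; s++) {
        int32_t best = prev[s];             /* stay in the same state */
        int32_t from = s;

        if (s > 0 && prev[s - 1] > best) {  /* or advance from state s-1 */
            best = prev[s - 1];
            from = s - 1;
        }
        if (best <= WORST_SCORE) {          /* no live path reaches s */
            cur[s] = WORST_SCORE;
            bp[s] = -1;
        } else {
            cur[s] = best + senscr[state_sen[s]];
            bp[s] = from;
        }
    }
}
```

Because prev and cur are separate arrays, the update for state s never sees a score that was already overwritten in the same frame.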

void align_free (void)

int32 align_init (mdef_t *_mdef, tmat_t *_tmat, dict_t *_dict, cmd_ln_t *_config, logmath_t *_logmath)

void align_sen_active (uint8 *senlist, int32 n_sen)

Flag the active senones.

Parameters:
senlist  Out: senlist[s] TRUE iff active in frame
n_sen  In: Size of senlist[] array
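
The flag-array convention (senlist[s] TRUE iff senone s is active) might be filled in along these lines. The flag_active_senones function and its state_sen input are hypothetical; how the real aligner enumerates its active HMM states is not described here, and uint8_t and int32_t stand in for uint8 and int32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: state_sen[i] is the senone id used by the i-th
 * currently active HMM state. Several states may share one senone. */
void flag_active_senones(uint8_t *senlist, int32_t n_sen,
                         const int32_t *state_sen, int32_t n_active)
{
    assert(senlist != NULL && n_sen > 0);
    assert(state_sen != NULL || n_active == 0);

    memset(senlist, 0, (size_t)n_sen);      /* every senone starts inactive */
    for (int32_t i = 0; i < n_active; i++) {
        assert(state_sen[i] >= 0 && state_sen[i] < n_sen);
        senlist[state_sen[i]] = 1;          /* TRUE: needed this frame */
    }
}
```

A scorer can then skip every senone whose flag is 0, so only the scores the aligner will actually read are computed for the frame.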

int32 align_start_utt (char *uttid)

Start Viterbi alignment using the previously built sentence HMM. Assumes that each utterance is aligned only once; state member variables are initialized while the sentence HMM is built.
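
The descriptions on this page suggest a per-utterance call order, sketched below. To keep the example self-contained, every align_* function is replaced by a placeholder stub that only records its name. The stubs, call_log, note, score_senones, N_SEN and align_one_utt are all hypothetical, and the order itself is inferred from the descriptions rather than stated by the header. align_init and align_free are assumed to bracket the whole run and are not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t int32;                        /* stand-ins for s3types.h */
typedef uint8_t uint8;
typedef struct align_stseg_s align_stseg_t;   /* left opaque in this sketch */
typedef struct align_phseg_s align_phseg_t;
typedef struct align_wdseg_s align_wdseg_t;

#define N_SEN 4                               /* hypothetical senone count */
char call_log[256];                           /* records the call order */

static void note(const char *name)
{
    if (strlen(call_log) + strlen(name) + 2 <= sizeof(call_log)) {
        strcat(call_log, name);
        strcat(call_log, " ");
    }
}

/* Placeholder stubs with the documented signatures; NOT the real code. */
int32 align_build_sent_hmm(char *wordstr, int insert_sil)
{ (void)wordstr; (void)insert_sil; note("build"); return 0; }
int32 align_start_utt(char *uttid) { (void)uttid; note("start"); return 0; }
void align_sen_active(uint8 *senlist, int32 n_sen)
{ memset(senlist, 0, (size_t)n_sen); note("active"); }
int32 align_frame(int32 *senscr) { (void)senscr; note("frame"); return 0; }
int32 align_end_utt(align_stseg_t **st, align_phseg_t **ph, align_wdseg_t **wd)
{ *st = NULL; *ph = NULL; *wd = NULL; note("end"); return 0; }
int32 align_destroy_sent_hmm(void) { note("destroy"); return 0; }

/* Hypothetical acoustic scorer; this header does not declare one. */
static void score_senones(const uint8 *senlist, int32 *senscr, int32 n_sen)
{
    (void)senlist;
    for (int32 s = 0; s < n_sen; s++)
        senscr[s] = 0;
    note("score");
}

/* Align one utterance of n_frame frames. Only align_build_sent_hmm has a
 * documented return convention, so the other return values are not
 * interpreted here. */
int32 align_one_utt(char *uttid, char *wordstr, int32 n_frame)
{
    uint8 senlist[N_SEN];
    int32 senscr[N_SEN];
    align_stseg_t *stseg;
    align_phseg_t *phseg;
    align_wdseg_t *wdseg;

    assert(n_frame > 0);
    if (align_build_sent_hmm(wordstr, 1) < 0)
        return -1;                            /* e.g. OOV word in transcript */
    align_start_utt(uttid);
    for (int32 f = 0; f < n_frame; f++) {
        align_sen_active(senlist, N_SEN);     /* which senones are needed */
        score_senones(senlist, senscr, N_SEN);
        align_frame(senscr);                  /* one Viterbi step */
    }
    align_end_utt(&stseg, &phseg, &wdseg);    /* traceback and output */
    align_destroy_sent_hmm();
    return 0;
}
```

With the stubs above, aligning a two-frame utterance leaves "build start active score frame active score frame end destroy " in call_log.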


Generated on Sat Apr 11 00:02:32 2009 by  doxygen 1.3.9.1