s3_align.c File Reference

Engine for Sphinx 3 aligner.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <feat.h>
#include <strfuncs.h>
#include <s3types.h>
#include "mdef.h"
#include "tmat.h"
#include "dict.h"
#include "logs3.h"
#include "s3_align.h"

Classes

struct  pnode_s
struct  plink_s
struct  history_s
struct  snode_s
struct  slink_s

Defines

#define ACTIVE_LIST_SIZE_INCR   16380

Typedefs

typedef struct pnode_s pnode_t
typedef struct plink_s plink_t
typedef struct history_s history_t
typedef struct snode_s snode_t
typedef struct slink_s slink_t

Functions

int32 align_build_sent_hmm (char *wordstr, int insert_sil)
int32 align_destroy_sent_hmm (void)
void align_sen_active (uint8 *senlist, int32 n_sen)
int32 align_start_utt (char *uttid)
int32 align_frame (int32 *senscr)
int32 align_end_utt (align_stseg_t **stseg_out, align_phseg_t **phseg_out, align_wdseg_t **wdseg_out)
int32 align_init (mdef_t *_mdef, tmat_t *_tmat, dict_t *_dict, cmd_ln_t *_config, logmath_t *_logmath)
void align_free (void)


Detailed Description

Engine for Sphinx 3 aligner.


Define Documentation

#define ACTIVE_LIST_SIZE_INCR   16380
 


Typedef Documentation

typedef struct history_s history_t
 

Viterbi search history for each state at each time.

typedef struct plink_s plink_t
 

A phone node may have links (transitions) to several successor or predecessor nodes. These are captured by a list of the following plink_t type.

typedef struct pnode_s pnode_t
 

Phone-level sentence HMM structures:

  • pnode_t: nodes of phones forming the sentence HMM.
  • plink_t: a link between two pnode_t nodes.

A phone node may have multiple successors and/or predecessors because of multiple alternative pronunciations for a word, as well as the presence of optional filler words.

Assumptions:

  • No cycles in phone level sentence HMM.
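
The node and link relationship described above can be pictured with a minimal sketch. The type and field names below (`toy_pnode_t`, `toy_plink_t`, `succ`, `pred`, and so on) are hypothetical stand-ins chosen for illustration; they are not the actual members of pnode_s or plink_s, which this page does not list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for pnode_t / plink_t. */
typedef struct toy_pnode_s toy_pnode_t;

typedef struct toy_plink_s {
    toy_pnode_t *node;           /* Node at the other end of this link */
    struct toy_plink_s *next;    /* Next link in the same list */
} toy_plink_t;

struct toy_pnode_s {
    int phone_id;                /* Illustrative phone identifier */
    toy_plink_t *succ;           /* List of links to successor nodes */
    toy_plink_t *pred;           /* List of links to predecessor nodes */
};

/* Prepend a link to `to` onto `*list`. Returns 0 on success, -1 if out of memory. */
static int toy_add_link(toy_plink_t **list, toy_pnode_t *to)
{
    toy_plink_t *l = malloc(sizeof(*l));
    if (l == NULL)
        return -1;
    l->node = to;
    l->next = *list;
    *list = l;
    return 0;
}

/* Record the transition src -> dst in both nodes' link lists. */
static int toy_link_pnodes(toy_pnode_t *src, toy_pnode_t *dst)
{
    assert(src != dst);          /* A self-link would be a cycle */
    if (toy_add_link(&src->succ, dst) < 0)
        return -1;
    return toy_add_link(&dst->pred, src);
}

/* Count the links in a list. */
static int toy_count_links(const toy_plink_t *l)
{
    int n = 0;
    for (; l != NULL; l = l->next)
        n++;
    return n;
}
```

A node for a word with two alternative pronunciations would end up with two entries in its `succ` list, one per pronunciation's first phone. The sketch only rejects self-links; it does not check for longer cycles.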

typedef struct slink_s slink_t
 

typedef struct snode_s snode_t
 

State DAG structures similar to phone DAG structures.


Function Documentation

int32 align_build_sent_hmm (char *wordstr, int insert_sil)
 

Build a sentence HMM for the given transcription (wordstr). A two-level DAG is built: phone-level and state-level.

  • <s> and </s> always added at the beginning and end of sentence to form an augmented transcription.
  • Optional <sil> and noise words added between words in the augmented transcription.

wordstr must contain only the transcript, with no extraneous material such as an utterance ID. The phone-level HMM structure has replicated nodes to allow for different left and right context CI phones; hence, each pnode corresponds to a unique triphone in the sentence HMM.

Returns 0 if successful, <0 on any error (e.g., an OOV word encountered).

Parameters:
wordstr  In: Word transcript
insert_sil  In: Whether to insert silences/fillers
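
To make the augmented transcription concrete, here is a minimal sketch that renders it as a flat string. It is an illustration only: the real function builds a DAG and looks words up in the dictionary, neither of which is attempted here. The function names and the `[<sil>]` bracket notation for an optional filler slot are assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Append n bytes of s to the NUL-terminated string in out (capacity outsz).
 * Returns 0 on success, -1 if the result would not fit. */
static int toy_append(char *out, size_t outsz, const char *s, size_t n)
{
    size_t len = strlen(out);
    if (len + n + 1 > outsz)
        return -1;
    memcpy(out + len, s, n);
    out[len + n] = '\0';
    return 0;
}

/* Write "<s> w1 w2 ... </s>" into out, with an optional-filler marker
 * between every pair of adjacent words when insert_sil is nonzero.
 * Returns the number of transcript words, or -1 if out is too small. */
static int toy_augment(const char *wordstr, int insert_sil,
                       char *out, size_t outsz)
{
    static const char opt_sil[] = " [<sil>]";
    const char *p = wordstr;
    int n_words = 0;

    assert(wordstr != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    if (toy_append(out, outsz, "<s>", 3) < 0)
        return -1;

    while (*p != '\0') {
        const char *start;

        /* Skip whitespace, then find the end of the next word. */
        while (*p != '\0' && isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;
        start = p;
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;

        if (insert_sil && toy_append(out, outsz, opt_sil, sizeof(opt_sil) - 1) < 0)
            return -1;
        if (toy_append(out, outsz, " ", 1) < 0)
            return -1;
        if (toy_append(out, outsz, start, (size_t)(p - start)) < 0)
            return -1;
        n_words++;
    }

    if (insert_sil && toy_append(out, outsz, opt_sil, sizeof(opt_sil) - 1) < 0)
        return -1;
    if (toy_append(out, outsz, " </s>", 5) < 0)
        return -1;
    return n_words;
}
```

For the transcript `hello world` with insert_sil set, the sketch produces `<s> [<sil>] hello [<sil>] world [<sil>] </s>`; with insert_sil cleared it produces `<s> hello world </s>`.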

int32 align_destroy_sent_hmm (void)
 

int32 align_end_utt (align_stseg_t **stseg_out, align_phseg_t **phseg_out, align_wdseg_t **wdseg_out)
 

All frames consumed. Trace back best Viterbi state sequence and dump it out.

Parameters:
stseg_out  Out: list of state segmentation
phseg_out  Out: list of phone segmentation
wdseg_out  Out: list of word segmentation
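
The traceback step can be sketched as follows. This is a simplified illustration, not the aligner's code: `toy_hist_t` and `toy_seg_t` are hypothetical types standing in for history_t and the segmentation structures, and the assumption that each history entry stores a pointer to its best predecessor is made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical history entry: one per state per frame on the best path. */
typedef struct toy_hist_s {
    int state;                        /* State occupied at this frame */
    int frame;                        /* Frame index */
    const struct toy_hist_s *pred;    /* Best predecessor; NULL at the first frame */
} toy_hist_t;

/* Hypothetical segment: a run of consecutive frames spent in one state. */
typedef struct {
    int state;
    int sf;    /* Start frame */
    int ef;    /* End frame (inclusive) */
} toy_seg_t;

/* Follow predecessor pointers back from `last`, merging consecutive frames
 * in the same state into segments. Segments are returned in chronological
 * order in seg[]. Returns the segment count, or -1 if seg[] is too small. */
static int toy_backtrace(const toy_hist_t *last, toy_seg_t *seg, int max_seg)
{
    const toy_hist_t *h;
    int n = 0, i, j;

    assert(seg != NULL && max_seg >= 0);

    /* Walk backwards in time; segments come out in reverse order. */
    for (h = last; h != NULL; h = h->pred) {
        if (n > 0 && seg[n - 1].state == h->state) {
            seg[n - 1].sf = h->frame;     /* Extend the segment backwards */
        } else {
            if (n >= max_seg)
                return -1;
            seg[n].state = h->state;
            seg[n].sf = h->frame;
            seg[n].ef = h->frame;
            n++;
        }
    }

    /* Reverse into chronological order. */
    for (i = 0, j = n - 1; i < j; i++, j--) {
        toy_seg_t tmp = seg[i];
        seg[i] = seg[j];
        seg[j] = tmp;
    }
    return n;
}
```

Phone and word segmentations could be derived the same way by merging runs of states that belong to the same phone or word, which is presumably why the function returns all three lists from a single traceback.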

int32 align_frame (int32 *senscr)
 

One frame of Viterbi time alignment.

Parameters:
senscr  In: array of senone scores this frame
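
As a rough illustration of what one frame of Viterbi alignment involves, the sketch below updates a plain left-to-right chain of states. It is deliberately simpler than the aligner, which works on the state-level DAG with model transition matrices: the chain topology, the two shared transition scores, and all names here are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for "log zero"; halved so that comparisons never overflow. */
#define TOY_WORST_SCORE (INT32_MIN / 2)

/* One Viterbi step over a linear chain of n states (log-domain scores,
 * higher is better).
 *   score[i]  In/out: best path score ending in state i
 *   bp[i]     Out: predecessor chosen for state i (i or i-1)
 *   sen_id[i] In: senone index of state i
 *   senscr[]  In: senone scores for this frame
 *   tp_self, tp_next  In: hypothetical transition scores shared by all states
 */
static void toy_viterbi_frame(int32_t *score, int *bp, int n,
                              const int *sen_id, const int32_t *senscr,
                              int32_t tp_self, int32_t tp_next)
{
    int i;

    assert(n > 0);

    /* Go from the last state to the first so that score[i-1] still holds
     * the previous frame's value when state i reads it. */
    for (i = n - 1; i >= 0; i--) {
        int32_t stay = TOY_WORST_SCORE;
        int32_t move = TOY_WORST_SCORE;

        if (score[i] > TOY_WORST_SCORE)
            stay = score[i] + tp_self;
        if (i > 0 && score[i - 1] > TOY_WORST_SCORE)
            move = score[i - 1] + tp_next;

        if (move > stay) {
            score[i] = move;
            bp[i] = i - 1;
        } else {
            score[i] = stay;
            bp[i] = i;           /* Also the value for an unreachable state */
        }

        /* Add the acoustic score only to states that are reachable. */
        if (score[i] > TOY_WORST_SCORE)
            score[i] += senscr[sen_id[i]];
    }
}
```

Calling this once per frame, starting with only the first state reachable, gives the per-frame recursion; the `bp` values recorded at each frame are what a later traceback would follow.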

void align_free (void)
 

int32 align_init (mdef_t *_mdef, tmat_t *_tmat, dict_t *_dict, cmd_ln_t *_config, logmath_t *_logmath)
 

void align_sen_active (uint8 *senlist, int32 n_sen)
 

Flag the active senones.

Parameters:
senlist  Out: senlist[s] TRUE iff active in frame
n_sen  In: Size of senlist[] array
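
The flagging step might look like the following sketch. Unlike align_sen_active, which takes only senlist and n_sen, this illustration receives the senone indices of the active states as an explicit argument; that parameter and the function name are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Set senlist[s] to 1 for every senone used by an active state, 0 otherwise.
 *   active_sen  In: senone index of each currently active state
 *   n_active    In: number of entries in active_sen[]
 *   senlist     Out: per-senone activity flags
 *   n_sen       In: size of senlist[]
 */
static void toy_sen_active(const int *active_sen, int n_active,
                           uint8_t *senlist, int n_sen)
{
    int i;

    assert(senlist != NULL && n_sen >= 0);

    /* Clear all flags, then mark the senones that are referenced. */
    memset(senlist, 0, (size_t)n_sen * sizeof(*senlist));
    for (i = 0; i < n_active; i++) {
        assert(active_sen[i] >= 0 && active_sen[i] < n_sen);
        senlist[active_sen[i]] = 1;
    }
}
```

Several active states may share a senone, so the same flag can be set more than once. A caller would typically use the flags to score only the active senones in the next frame.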

int32 align_start_utt (char *uttid)
 

Start Viterbi alignment using the sentence HMM previously built. Assumes that each utterance will be aligned only once; state member variables are initialized during sentence HMM building.


Generated on Sat Apr 11 00:02:32 2009 by  doxygen 1.3.9.1