java.lang.Object
  edu.isi.karma.modeling.semantictypes.SemanticTypeUtil
public class SemanticTypeUtil
This class provides various utility methods that can be used by the semantic typing module.
Constructor Summary | |
---|---|
SemanticTypeUtil()
|
Method Summary | |
---|---|
static java.util.ArrayList<java.lang.String> |
getTrainingExamples(edu.isi.karma.rep.Worksheet worksheet,
edu.isi.karma.rep.HNodePath path)
Prepares and returns a collection of training examples to be used in semantic types training. |
static void |
identifyOutliers(edu.isi.karma.rep.Worksheet worksheet,
java.lang.String predictedType,
edu.isi.karma.rep.HNodePath path,
edu.isi.karma.rep.metadata.Tag outlierTag,
java.util.Map<CRFModelHandler.ColumnFeature,java.util.Collection<java.lang.String>> columnFeatures,
CRFModelHandler crfModelHandler)
Identifies the outlier nodes (table cells) for a given column. |
static boolean |
populateSemanticTypesUsingCRF(edu.isi.karma.rep.Worksheet worksheet,
edu.isi.karma.rep.metadata.Tag outlierTag,
CRFModelHandler crfModelHandler)
Predicts semantic types for all the columns in a worksheet using the CRF modeling technique developed by Aman Goel. |
static java.lang.String |
removeNamespace(java.lang.String uri)
Removes the namespace from a given URI. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SemanticTypeUtil()
Method Detail |
---|
public static java.util.ArrayList<java.lang.String> getTrainingExamples(edu.isi.karma.rep.Worksheet worksheet, edu.isi.karma.rep.HNodePath path)
worksheet - The target worksheet
path - Path to the target column
public static boolean populateSemanticTypesUsingCRF(edu.isi.karma.rep.Worksheet worksheet, edu.isi.karma.rep.metadata.Tag outlierTag, CRFModelHandler crfModelHandler)
worksheet - The target worksheet
outlierTag - Tag object that stores outlier nodes
crfModelHandler - The CRF Model Handler to use
public static void identifyOutliers(edu.isi.karma.rep.Worksheet worksheet, java.lang.String predictedType, edu.isi.karma.rep.HNodePath path, edu.isi.karma.rep.metadata.Tag outlierTag, java.util.Map<CRFModelHandler.ColumnFeature,java.util.Collection<java.lang.String>> columnFeatures, CRFModelHandler crfModelHandler)
worksheet - Target worksheet
predictedType - Type that was user-assigned or predicted by the CRF model for the given column. If the type for a given node differs from predictedType, the node is tagged as an outlier and its id is stored in the outlier tag object
path - Path to the given column
outlierTag - The outlier tag object, which stores all the outlier node ids
columnFeatures - Features, such as column name and table name, that the CRF model requires to predict the semantic type for a node (table cell)
crfModelHandler - The CRF Model Handler to use
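The outlier rule described for predictedType can be illustrated with a minimal, self-contained sketch. It does not use the Karma classes: `OutlierSketch`, `findOutlierIds`, and the `typeByNodeId` map are hypothetical stand-ins (assumptions, not part of this API) for the worksheet, the CRF model output, and the outlier tag object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the SemanticTypeUtil implementation.
final class OutlierSketch {

    // typeByNodeId is an assumed map from a node (table cell) id to the type
    // predicted for that node. A node whose type differs from predictedType
    // is treated as an outlier, and its id is collected.
    static List<String> findOutlierIds(Map<String, String> typeByNodeId,
                                       String predictedType) {
        List<String> outlierIds = new ArrayList<>();
        for (Map.Entry<String, String> entry : typeByNodeId.entrySet()) {
            if (!predictedType.equals(entry.getValue())) {
                outlierIds.add(entry.getKey());
            }
        }
        return outlierIds;
    }

    public static void main(String[] args) {
        // Made-up node ids and type labels, for demonstration only.
        Map<String, String> typeByNodeId = new LinkedHashMap<>();
        typeByNodeId.put("N1", "City");
        typeByNodeId.put("N2", "State");
        typeByNodeId.put("N3", "City");
        System.out.println(findOutlierIds(typeByNodeId, "City"));
    }
}
```

In the documented method, the collected ids would be stored in outlierTag rather than returned; how the per-node type is obtained from crfModelHandler and columnFeatures is not described here.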
public static java.lang.String removeNamespace(java.lang.String uri)
uri - Input URI
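This page does not say how the namespace is identified. One plausible reading, sketched below as an assumption, is that the local name is whatever follows the last `#` or `/` in the URI. `NamespaceSketch` is a hypothetical stand-in, not the actual SemanticTypeUtil code, and the example URIs are made up.

```java
// Hypothetical illustration only; the real removeNamespace may behave differently.
final class NamespaceSketch {

    // Returns the part of the URI after the last '#' or '/', or the input
    // unchanged when neither separator is present.
    static String removeNamespace(String uri) {
        if (uri == null) {
            return null;
        }
        int cut = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/'));
        return cut >= 0 ? uri.substring(cut + 1) : uri;
    }

    public static void main(String[] args) {
        System.out.println(removeNamespace("http://example.org/ontology#City"));
        System.out.println(removeNamespace("http://example.org/ontology/City"));
        System.out.println(removeNamespace("City"));
    }
}
```

Under this assumption all three calls in `main` yield `City`. A URI that ends in a separator yields an empty string, which a caller may want to handle explicitly.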