Compared with Version 6 by laura.tolosi on Dec 04, 2014 14:26.

h2. Introduction
Edlin is a collection of machine learning algorithms comprising a large number of state-of-the-art methods for classification and sequence tagging. Although at their core these are general machine learning approaches (perceptrons, logistic regression), the implementation is optimized for NLP learning tasks:
* inputs are represented as sparse document-term matrices
* parallel computation is used whenever possible (to handle very large datasets)
* NLP-specific evaluation metrics such as precision, recall and F-measure are reported
* appropriate feature selection methods are included to reduce dimensionality, etc.
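
To make two of the points above concrete, here is a minimal Python sketch of a sparse document-term representation and of precision/recall/F1 for a single label. It is an illustration only and not Edlin's API: the function names {{to_sparse_rows}} and {{precision_recall_f1}} are hypothetical, and the F-measure is assumed to be the balanced F1.

```python
from collections import Counter


def to_sparse_rows(documents):
    """Turn tokenized documents into sparse rows.

    Each row is a dict {term_index: count} that stores only the terms
    occurring in that document, so absent terms cost no memory.
    """
    vocabulary = {}  # term -> column index, assigned on first occurrence
    rows = []
    for tokens in documents:
        row = {}
        for term, count in Counter(tokens).items():
            index = vocabulary.setdefault(term, len(vocabulary))
            row[index] = count
        rows.append(row)
    return vocabulary, rows


def precision_recall_f1(gold, predicted, positive_label):
    """Compute precision, recall and F1 for one label (hypothetical helper)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    fp = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    fn = sum(1 for g, p in pairs if g == positive_label and p != positive_label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    vocab, rows = to_sparse_rows([["good", "movie", "good"], ["bad", "movie"]])
    print(vocab, rows)
    print(precision_recall_f1(["pos", "pos", "neg"], ["pos", "neg", "neg"], "pos"))
```

Storing each document as a dictionary of non-zero counts keeps memory proportional to the number of distinct terms per document rather than to the full vocabulary size, which is the usual motivation for sparse document-term matrices.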

Edlin consists of four sub-projects (Basics, Edlin-Wrapper, Mallet-Wrapper and Feature Extraction) and is closely tied to another in-house project, Doc-Classif API (a.k.a. DAPI).