h2. Creating documents from texts and URLs
* Use the {{CorporaAPI}} methods to create a document from:
** text - use the method that takes the content as a string and a Boolean flag denoting whether the text contains markup.
** URL - use the method that takes a URL and a string naming the encoding to be used.
{code:java}
// ----
// create documents from texts and from URL
// ----
String content = "Blair and Bush ? are they doing the right thing for Iraq, America,"
        + " Europe, the Earth... for civilization... "
        + "or just guided by their blinded eyes are in favor of the big corporations: "
        + "enter here new unrecognized corporations with a clue suffix: "
        + "MicroZoftRR Inc.";
// this method takes the document content as a String and a boolean flag
// denoting whether the content contains any markup tags
KIMDocument kdocFromText = apiCorpora.createDocument(content, false);
System.out.println("Document created from TEXT ...\n[ DOCUMENT CONTENT BEGIN ]\n" +
        kdocFromText.getContent() + "\n[ DOCUMENT CONTENT END ]");
URL url = new URL("http://www.ontotext.com/kim");
// this method takes a URL and an encoding name as parameters
KIMDocument kdocFromUrl = apiCorpora.createDocument(url, "UTF-8");
System.out.println("Document created from URL ...\n[ DOCUMENT CONTENT BEGIN ]\n" +
        kdocFromUrl.getContent() + "\n[ DOCUMENT CONTENT END ]");
// ----
{code}
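The encoding argument matters because the bytes fetched from a URL only become text once they are decoded with a charset. The following sketch is independent of KIM and uses only the Java standard library; the class {{EncodingDemo}} and its {{decode}} helper are hypothetical names introduced for illustration. It decodes the same bytes with two charsets to show how a wrong encoding name garbles non-ASCII characters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper (not part of KIM): turns raw bytes into text
    // using the charset with the given name.
    public static String decode(byte[] raw, String encodingName) {
        return new String(raw, Charset.forName(encodingName));
    }

    public static void main(String[] args) {
        // "caf\u00e9" encoded as UTF-8: the accented character takes two bytes.
        byte[] raw = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(raw, "UTF-8");
        String wrong = decode(raw, "ISO-8859-1");

        // The correct charset restores the original four characters;
        // the wrong one yields one character per byte instead.
        System.out.println("Decoded as UTF-8, length:      " + right.length());
        System.out.println("Decoded as ISO-8859-1, length: " + wrong.length());
        System.out.println("UTF-8 round trip intact: " + right.equals("caf\u00e9"));
    }
}
```

If the content of a document created from a URL shows stray characters, a mismatch between the page's actual encoding and the encoding name passed in is a likely cause.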
h2. Storing and loading documents
* Use {{DocumentRepositoryAPI}} for storing documents.
* Use the {{addDocument()}} method to store a specific document.
When the document is stored, it is assigned a persistence ID. You can later use the ID to load this document.
{code:java}
// ----------------------------------------------------------------------------------
// store and load a document
// ----------------------------------------------------------------------------------
apiDR.addDocument(kdocFromText);
apiDR.addDocument(kdocFromUrl);
System.out.println("Documents stored.");
System.out.println("All documents in store: " + apiDR.getDocumentCount(new DocumentQuery()));
String loadedContentText = null;
String loadedContentUrl = null;
KIMDocument kdocFromText2 = apiDR.loadDocument(kdocFromText.getDocumentId());
loadedContentText = kdocFromText2.getContent();
KIMDocument kdocFromUrl2 = apiDR.loadDocument(kdocFromUrl.getDocumentId());
loadedContentUrl = kdocFromUrl2.getContent();
System.out.println("Documents loaded.");
// check if stored and loaded objects have the same content
System.out.println("Documents compare : stored - loaded.");
System.out.println(" - TEXT Documents are the same: " + String.valueOf(kdocFromText.getContent().equals(loadedContentText)));
System.out.println(" - URL Documents are the same : " + String.valueOf(kdocFromUrl.getContent().equals(loadedContentUrl)));
// ----------------------------------------------------------------------------------
{code}
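The store-then-load round trip above relies on the repository assigning an ID at storage time. As a rough mental model only, the sketch below mimics that contract with an in-memory map. Everything in it ({{InMemoryStoreSketch}}, {{Doc}}, {{add}}, {{load}}, the sequential numbering) is a hypothetical stand-in and says nothing about how KIM actually persists documents or chooses IDs.

```java
import java.util.HashMap;
import java.util.Map;

public class InMemoryStoreSketch {
    // Hypothetical stand-in for a document; not a KIM class.
    public static final class Doc {
        public long id = -1;           // -1 means "not stored yet"
        public final String content;

        public Doc(String content) {
            this.content = content;
        }
    }

    private final Map<Long, String> store = new HashMap<>();
    private long nextId = 1;

    // Stores the content and writes the newly assigned ID back into the document.
    public long add(Doc doc) {
        doc.id = nextId++;
        store.put(doc.id, doc.content);
        return doc.id;
    }

    // Builds a fresh object from the stored content, or returns null for an unknown ID.
    public Doc load(long id) {
        String content = store.get(id);
        if (content == null) {
            return null;
        }
        Doc loaded = new Doc(content);
        loaded.id = id;
        return loaded;
    }

    public static void main(String[] args) {
        InMemoryStoreSketch repo = new InMemoryStoreSketch();
        Doc original = new Doc("Some text");
        long id = repo.add(original);
        Doc loaded = repo.load(id);

        // The loaded object is a different instance with equal content,
        // which is why the example above compares content strings.
        System.out.println("Same instance: " + (original == loaded));
        System.out.println("Same content:  " + original.content.equals(loaded.content));
    }
}
```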
h2. Loading documents from persistence
Use {{getDocumentIds()}} to get document IDs when you need to browse documents based on some search criteria.
This method takes a filter that expresses these search criteria. Creating filters is the subject of the [Searching for Documents|07. Searching for documents] scenario. Here, we focus on how to load documents. Once you have the ID list, you can iterate over it and load the required documents from persistence.
{code:java}
// ----------------------------------------------------------------------------------
// load documents from persistence
// ----------------------------------------------------------------------------------
System.out.println("Loading documents from persistence (restricted by date 2000/01/01) ...");
DocumentQueryResult listDocIDs = null; // placeholder, assigned in the try block below
try {
    DocumentQuery query = new DocumentQuery();
    // java.util.GregorianCalendar avoids the deprecated Date(String) constructor
    query.setTimeIntervalStartDate(new GregorianCalendar(2000, Calendar.JANUARY, 1).getTime());
    listDocIDs = apiDR.getDocumentIds(query);
} catch (Exception ex1) {
    System.out.println("Can NOT get the docIds!!!");
    return;
}
int numDocIDs = listDocIDs.size();
int numReadDocs = 0;
System.out.println("Documents Found: " + numDocIDs);
for (int i = 0; i < numDocIDs; i++) {
    long docID = listDocIDs.get(i).getDocumentId();
    try {
        KIMDocument kdoc = apiDR.loadDocument(docID);
        System.out.println(" - " + kdoc.getFeatures().get("TITLE"));
        numReadDocs += 1;
    } catch (Exception ex) {
        // a single failed load is reported and the loop moves on to the next ID
        System.out.println(" - " + "Can NOT load a doc with docId=" + docID + "!!!");
    }
}
System.out.println("Documents Successfully Read: " + numReadDocs);
// ----------------------------------------------------------------------------------
{code}
h2. Synchronizing documents
Use {{syncDocument()}} to synchronize a document.
If the document already exists, it is updated. If not, it is added to the persistence and indexed.
{code:java}
// ----------------------------------------------------------------------------------
// synchronize documents
// ----------------------------------------------------------------------------------
apiDR.syncDocument(kdocFromText);
apiDR.syncDocument(kdocFromUrl);
System.out.println("Documents synchronized.");
// ----------------------------------------------------------------------------------
{code}
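The update-or-add behaviour described for {{syncDocument()}} is the usual "upsert" pattern. The sketch below illustrates only that pattern with a plain map; {{SyncSketch}} and its methods are hypothetical names, indexing is not modelled, and nothing here reflects KIM's internal implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class SyncSketch {
    private final Map<Long, String> store = new HashMap<>();

    // Upsert: overwrites the entry when the ID is already known, adds it otherwise.
    // Returns "updated" or "added" so the caller can see which branch was taken.
    public String sync(long id, String content) {
        String previous = store.put(id, content);
        return previous != null ? "updated" : "added";
    }

    public String get(long id) {
        return store.get(id);
    }

    public int size() {
        return store.size();
    }

    public static void main(String[] args) {
        SyncSketch repo = new SyncSketch();
        System.out.println("First sync:  " + repo.sync(7L, "draft"));
        System.out.println("Second sync: " + repo.sync(7L, "final"));
        // Syncing the same ID twice leaves exactly one entry holding the latest content.
        System.out.println("Entries: " + repo.size() + ", content: " + repo.get(7L));
    }
}
```

Under this model, calling sync repeatedly on the same document is safe: the store never ends up with two copies of it.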
h2. Deleting documents
To remove a document from the persistence repository, pass its ID to the {{deleteDocument()}} method.
{code:java}
// ----------------------------------------------------------------------------------
// delete documents
// ----------------------------------------------------------------------------------
apiDR.deleteDocument(kdocFromText.getDocumentId());
apiDR.deleteDocument(kdocFromUrl.getDocumentId());
System.out.println("Documents deleted from persistence.");
System.out.println("All documents in store: " + apiDR.getDocumentCount(new DocumentQuery()));
// ----------------------------------------------------------------------------------
{code}