Digitization is the process of converting raw data or printed books into digital formats such as XML, SGML and EPUB. Data conversion is the conversion of digital data from one format into another, carried out mostly to meet application interoperability requirements or to take advantage of new features. Aster, a leader in digitization and data conversion, transforms various data formats into functional formats that are compatible with the web and with digital devices and e-book platforms (such as EPUB, Mobi and Nook). We convert hard-copy books, journals, magazines and newspapers into electronic formats such as PDF, Word, MS Excel, TIFF, JPEG, QuarkXPress, InDesign and FrameMaker, and then into XML, SGML, HTML and other digital formats. The output is validated against the Document Type Definition (DTD) specified by the client. We have the expertise and capacity to undertake large, complex data conversion projects involving multiple file formats.
Our services include:
Optical Character Recognition (OCR):
Defining Metadata:
As the volume of digital data grows, metadata is becoming an increasingly significant feature of structured content, especially for finding a specific resource in a digital library. It is used mainly by search engines, and it makes documents in a database easier to find when they are looked up.
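To illustrate how metadata eases lookup, here is a minimal sketch in Python. It assumes each document carries a small metadata record; the field names (`id`, `title`, `keywords`) and the sample records are hypothetical and do not describe Aster's actual schema or tooling.

```python
from collections import defaultdict

# Hypothetical metadata records; field names and values are illustrative only.
records = [
    {"id": "doc-001", "title": "Introduction to Soil Chemistry",
     "keywords": ["soil", "chemistry"]},
    {"id": "doc-002", "title": "Marine Chemistry Review",
     "keywords": ["marine", "chemistry"]},
]


def build_index(records):
    """Map each lowercase term from titles and keywords to a set of document ids."""
    index = defaultdict(set)
    for record in records:
        terms = record["title"].lower().split()
        terms += [keyword.lower() for keyword in record["keywords"]]
        for term in terms:
            index[term].add(record["id"])
    return index


def search(index, term):
    """Return the sorted ids of documents whose metadata contains the term."""
    return sorted(index.get(term.lower(), set()))


index = build_index(records)
print(search(index, "chemistry"))  # ['doc-001', 'doc-002']
```

A production search engine would add stemming, ranking and field weighting; the point here is only that a lookup against captured metadata avoids scanning the full text of every document.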
We capture the following in the body of the document:
- Section levels
- Emphasis – bold, italics and underline
- Lists – numbered, unnumbered, bulleted and other list formats
- Inline and display equations, captured using MathML
- Tables are tagged according to the source alignment.
- Abbreviations, glossary.
- Cross reference links are given for tables, figures, notes, references, etc.
- Bibliographical references
- Article / book title
- Journal / book name
- Journal supplementary
- Volume / edition number
- Issue number
- ISSN / ISBN (number)
- Start & end page
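As a rough illustration of how the bibliographic fields listed above might be tagged, the following Python sketch builds one reference element with the standard library. The element names and the sample values are assumptions made for this example; in a real project the tag set comes from the client's DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; a real project takes these from the client's DTD.
FIELDS = ("article-title", "journal-name", "volume", "issue",
          "issn", "first-page", "last-page")


def tag_reference(ref):
    """Build a <reference> element from a dict of bibliographic fields."""
    element = ET.Element("reference", id=ref["id"])
    for field in FIELDS:
        if field in ref:  # fields missing from the source are simply not tagged
            ET.SubElement(element, field).text = ref[field]
    return element


# Placeholder values, not a real citation.
sample = {
    "id": "ref1",
    "article-title": "An Example Article",
    "journal-name": "Example Journal",
    "volume": "12",
    "issue": "3",
    "issn": "0000-0000",
    "first-page": "101",
    "last-page": "110",
}

xml_text = ET.tostring(tag_reference(sample), encoding="unicode")
print(xml_text)
```

The `id` attribute gives cross-reference links in the body a target to point at, which is one common way to connect a citation to its entry in the reference list.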
Citation Tagging, Indexing, Referencing and Keywords coding:
As part of the post-scanning process, a scanned image is edited to produce better-quality output. Illustrations are usually captured at resolutions of 50 dpi, 300 dpi or 600 dpi. The imaging process includes:
- Edit the image to eliminate moiré, ink bleed-through from text or images, watermarks, pen marks, and stains such as ink and rust
- Create and add background to a picture
- Manipulate the raw picture, which involves masking, cloning and color correction
- Create shadows for image objects
- Animate the manipulated picture with special coding
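One of the clean-up steps above, removing faint bleed-through, can be sketched as a simple brightness threshold. This is a minimal Python illustration, not a description of the tools actually used: it assumes a grayscale page represented as nested lists of values from 0 (black) to 255 (white), and the function name and the threshold of 200 are hypothetical.

```python
def suppress_bleed_through(pixels, threshold=200):
    """Set every pixel at or above the threshold to pure white (255).

    Faint bleed-through is usually much lighter than the printed text,
    so light-gray pixels are whitened while dark text is left unchanged.
    """
    return [[255 if value >= threshold else value for value in row]
            for row in pixels]


# Tiny made-up page: dark text (30, 40), mid-gray (190), faint marks (210, 225).
page = [
    [30, 210, 255],
    [225, 40, 190],
]

cleaned = suppress_bleed_through(page)
print(cleaned)  # [[30, 255, 255], [255, 40, 190]]
```

A fixed threshold is the simplest option and can erase light detail in photographs, so it suits text pages better than illustrations.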