Orcus Documentation Release 0.16

Kohei Yoshida

Sep 24, 2021

CONTENTS

1 Overview

2 C++ API

3 Python API

4 CLI

5 Notes

6 Indices and tables

Index


Orcus is a library that provides a collection of standalone file processing filters and utilities. It was originally focused on providing filters for spreadsheet documents, but filters for other types of documents have been added to the mix.

CHAPTER ONE

OVERVIEW

1.1 Composition of the library

The primary goal of the orcus library is to provide a framework to import the contents of documents stored in various spreadsheet or spreadsheet-like formats. The library also provides several low-level parsers that can be used independently of the spreadsheet-related features if so desired. In addition, the library supports some hierarchical document formats, such as JSON and YAML, which were added later.

You can use this library through its C++ API, Python API, or CLI. However, the three do not expose all features of the library equally, and the C++ API is more complete than the other two.

The library is physically split into four parts:

1. the parser part that provides the aforementioned low-level parsers,
2. the filter part that provides higher-level import filters for spreadsheet and hierarchical documents, which internally use the low-level parsers,
3. the spreadsheet document model part that includes the document model suitable for storing spreadsheet document contents, and
4. the CLI for loading and converting spreadsheet and hierarchical documents.

If you only need the parser part of the library, you only need to link against the liborcus-parser library file. If you need to use the import filter part, link against both the liborcus-parser and the liborcus libraries. Likewise, if you need to use the spreadsheet document model part, link against the aforementioned two plus the liborcus-spreadsheet-model library.

Also note that the spreadsheet document model part has an additional dependency on the ixion library for handling formula re-calculations on document load.

1.2 Loading spreadsheet documents

The orcus library’s primary aim is to provide a framework to import the contents of documents stored in various spreadsheet, or spreadsheet-like formats. It supports two primary use cases. The first use case is where the client program does not have its own document model, but needs to import data from a spreadsheet-like document file and access its content without implementing its own document store from scratch. In this particular use case, you can simply use the document class to get it populated, and access its content through its API afterward. The second use case, which is a bit more advanced, is where the client program already has its own internal document model, and needs to use orcus to populate its document model. In this particular use case, you can implement your own set of classes that support necessary interfaces, and pass that to the orcus import filter.


For each document type that orcus supports, there is a top-level import filter class that serves as an entry point for loading the content of a document you wish to load. You don’t pass your document to this filter directly; instead, you wrap your document with what we call an import factory, then pass this factory instance to the loader. This import factory is then required to implement the necessary interfaces that the filter class uses to pass data to the document as the file is being parsed.

When using orcus’s own document model, you can simply use orcus’s own import factory implementation to wrap its document. When using your own document model, on the other hand, you’ll need to implement your own set of interface classes to wrap your document with.

The following sections describe how to load a spreadsheet document by using 1) orcus’s own spreadsheet document class, and 2) a user-defined custom document class.

1.2.1 Use orcus’s spreadsheet document class

If you want to use orcus’s own document class as your document store, you can use the import_factory class that orcus provides, which already implements all necessary interfaces. The example code shown below illustrates how to do this:

    #include <orcus/spreadsheet/document.hpp>
    #include <orcus/spreadsheet/factory.hpp>
    #include <orcus/orcus_ods.hpp>

    #include <ixion/model_context.hpp>
    #include <cassert>
    #include <cstdlib>
    #include <iostream>

    using namespace orcus;

    int main()
    {
        // Instantiate a document, and wrap it with a factory.
        spreadsheet::document doc;
        spreadsheet::import_factory factory(doc);

        // Pass the factory to the document loader, and read the content from a file
        // to populate the document.
        orcus_ods loader(&factory);
        loader.read_file("/path/to/document.ods");

        // Now that the document is fully populated, access its content.
        const ixion::model_context& model = doc.get_model_context();

        // Read the header row and print its content.

        ixion::abs_address_t pos(0, 0, 0); // Set the cell position to A1.
        ixion::string_id_t str_id = model.get_string_identifier(pos);

        const std::string* s = model.get_string(str_id);
        assert(s);
        std::cout << "A1: " << *s << std::endl;

        pos.column = 1; // Move to B1
        str_id = model.get_string_identifier(pos);
        s = model.get_string(str_id);
        assert(s);
        std::cout << "B1: " << *s << std::endl;

        pos.column = 2; // Move to C1
        str_id = model.get_string_identifier(pos);
        s = model.get_string(str_id);
        assert(s);
        std::cout << "C1: " << *s << std::endl;

        return EXIT_SUCCESS;
    }

This example code loads a file saved in the Open Document Spreadsheet format. It consists of the following content on its first sheet.

While it is not clear from this screenshot, cell C2 contains the formula CONCATENATE(A2, " ", B2) to concatenate the content of A2 and B2 with a space between them. Cells C3 through C7 contain similar formula expressions.

Let’s walk through this code step by step. First, we need to instantiate the document store. Here we are using the concrete document class available in orcus. We then immediately pass this document to the import_factory instance, also from orcus:

    // Instantiate a document, and wrap it with a factory.
    spreadsheet::document doc;
    spreadsheet::import_factory factory(doc);

The next step is to create the loader instance and pass the factory to it:

    // Pass the factory to the document loader, and read the content from a file
    // to populate the document.
    orcus_ods loader(&factory);

In this example we are using the orcus_ods filter class because the document we are loading is of Open Document Spreadsheet type, but the process is the same for other document types, the only difference being the name of the class. Once the filter object is constructed, we’ll simply load the file by calling its read_file() method and passing the path to the file as its argument:

    loader.read_file("/path/to/document.ods");

Once this call returns, the document has been fully populated. The rest of the code accesses the content of the first row of the first sheet of the document. First, you need to get a reference to the internal cell value store, which we call the model context:

    const ixion::model_context& model = doc.get_model_context();

Since the content of cell A1 is a string, to get the value you need to first get the ID of the string:

    ixion::abs_address_t pos(0, 0, 0); // Set the cell position to A1.
    ixion::string_id_t str_id = model.get_string_identifier(pos);

Once you have the ID of the string, you can pass that to the model to get the actual string value and print it to the standard output:

    const std::string* s = model.get_string(str_id);
    assert(s);
    std::cout << "A1: " << *s << std::endl;

Here we assume that the string value exists for the given ID. If you pass a string ID value to the get_string() method and there isn’t a string value associated with it, you’ll get a null pointer instead. You need this two-step process to get a string value because all the string values stored in the cells are pooled at the document model level, and the cells themselves only store the ID values.

You may also have noticed that the types surrounding the ixion::model_context class are all in the ixion namespace. This is because orcus’s own document class uses the formula engine from the ixion library to calculate the results of the formula cells inside the document, and the formula engine requires all cell values to be stored in the ixion::model_context instance.
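The two-step lookup described above can be sketched with a small stand-alone pool. This is only an illustration of the idea, not ixion's implementation: the string_pool class, its intern() method and the local string_id_t alias are hypothetical names that do not come from orcus or ixion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a document-level string pool (not ixion's API).
using string_id_t = std::size_t;

class string_pool
{
    std::vector<std::string> m_strings;                  // ID -> string value
    std::unordered_map<std::string, string_id_t> m_ids;  // string value -> ID

public:
    // Return the ID of s, adding it to the pool if it is not there yet.
    string_id_t intern(const std::string& s)
    {
        auto it = m_ids.find(s);
        if (it != m_ids.end())
            return it->second;

        string_id_t id = m_strings.size();
        m_strings.push_back(s);
        m_ids.emplace(s, id);
        assert(m_strings.size() == m_ids.size()); // one ID per distinct string
        return id;
    }

    // Return the pooled string for an ID, or nullptr when the ID is unknown.
    const std::string* get_string(string_id_t id) const
    {
        return id < m_strings.size() ? &m_strings[id] : nullptr;
    }
};
```

A cell in such a model would store only the string_id_t value; reading the text then takes one call to fetch the ID from the cell and a second call to get_string(), whose result has to be checked for null.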

Note: The document class in orcus uses the formula engine from the ixion library to calculate the results of the formula cells stored in the document.

The rest of the code basically repeats the same process for cells B1 and C1:

    pos.column = 1; // Move to B1
    str_id = model.get_string_identifier(pos);
    s = model.get_string(str_id);
    assert(s);
    std::cout << "B1: " << *s << std::endl;

    pos.column = 2; // Move to C1
    str_id = model.get_string_identifier(pos);
    s = model.get_string(str_id);
    assert(s);
    std::cout << "C1: " << *s << std::endl;

You will see the following output when you compile and run this code:

    A1: Number
    B1: String
    C1: Formula

Accessing numeric cell values is a bit simpler since the values are stored directly with the cells. Using the document from the above example code, the following code:

    for (spreadsheet::row_t row = 1; row <= 6; ++row)
    {
        ixion::abs_address_t pos(0, row, 0);
        double value = model.get_numeric_value(pos);
        std::cout << "A" << (pos.row+1) << ": " << value << std::endl;
    }

will access the cells from A2 through A7 and print out their numeric values. You should see the following output when you run this code block:

    A2: 1
    A3: 2
    A4: 3
    A5: 4
    A6: 5
    A7: 6

Handling formula cells is a bit more complex, since each formula cell contains two things: 1) the formula expression, which is stored as tokens internally, and 2) the cached result of the formula. The following code illustrates how to retrieve the cached formula results of cells C2 through C7:

    for (spreadsheet::row_t row = 1; row <= 6; ++row)
    {
        ixion::abs_address_t pos(0, row, 2); // Column C
        const ixion::formula_cell* fc = model.get_formula_cell(pos);
        assert(fc);

        // Get the formula cell results.
        const ixion::formula_result& result = fc->get_result_cache();

        // We already know the result is a string.
        ixion::string_id_t sid = result.get_string();
        const std::string* s = model.get_string(sid);
        assert(s);
        std::cout << "C" << (pos.row+1) << ": " << *s << std::endl;
    }

For each cell, this code first accesses the stored formula cell instance, gets a reference to its cached result, then obtains its string result value and prints it to the standard output. Running this block of code will yield the following output:

    C2: 1 Andy
    C3: 2 Bruce
    C4: 3 Charlie
    C5: 4 David
    C6: 5 Edward
    C7: 6 Frank

Warning: In production code, you should check the formula cell pointer, which is null when the cell at the specified position is not a formula cell.
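The check suggested in the warning can be sketched without ixion as follows. The formula_cell, sheet_model and cached_string_result() names below are hypothetical stand-ins and not ixion's types; the only point is that a lookup which may return a null pointer is tested before it is dereferenced, instead of being guarded by assert().

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical, simplified model: only formula cells and their cached
// string results are stored. None of these types come from ixion.
struct formula_cell
{
    std::string cached_result;
};

class sheet_model
{
    std::map<std::pair<int, int>, formula_cell> m_formulas; // (row, col) -> cell

public:
    void set_formula_result(int row, int col, std::string result)
    {
        assert(row >= 0 && col >= 0);
        m_formulas[{row, col}] = formula_cell{std::move(result)};
    }

    // Returns nullptr when the cell at (row, col) is not a formula cell.
    const formula_cell* get_formula_cell(int row, int col) const
    {
        auto it = m_formulas.find({row, col});
        return it == m_formulas.end() ? nullptr : &it->second;
    }
};

// Return the cached result, or an empty optional for non-formula cells.
std::optional<std::string> cached_string_result(const sheet_model& model, int row, int col)
{
    const formula_cell* fc = model.get_formula_cell(row, col);
    if (!fc)
        return std::nullopt; // not a formula cell; nothing to dereference

    return fc->cached_result;
}
```

Returning std::optional is one possible design; throwing an exception or returning an error code would serve equally well, as long as the null case is handled explicitly.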


1.2.2 Use a user-defined custom document class

In this section we will demonstrate how you can use orcus to populate your own custom document model by implementing your own set of interface classes and passing it to the orcus import filter. The first example code shown below is the absolute minimum that you need to implement in order for the orcus filter to function properly:

    #include <orcus/spreadsheet/import_interface.hpp>
    #include <orcus/orcus_ods.hpp>

    #include <cstdlib>
    #include <iostream>

    using namespace std;
    using namespace orcus::spreadsheet;
    using orcus::orcus_ods;

    class my_empty_import_factory : public iface::import_factory
    {
    public:
        virtual iface::import_sheet* append_sheet(
            sheet_t sheet_index, const char* sheet_name, size_t sheet_name_length) override
        {
            cout << "append_sheet: sheet index: " << sheet_index
                 << "; sheet name: " << string(sheet_name, sheet_name_length)
                 << endl;
            return nullptr;
        }

        virtual iface::import_sheet* get_sheet(
            const char* sheet_name, size_t sheet_name_length) override
        {
            cout << "get_sheet: sheet name: "
                 << string(sheet_name, sheet_name_length) << endl;
            return nullptr;
        }

        virtual iface::import_sheet* get_sheet(sheet_t sheet_index) override
        {
            cout << "get_sheet: sheet index: " << sheet_index << endl;
            return nullptr;
        }

        virtual void finalize() override {}
    };

    int main()
    {
        my_empty_import_factory factory;
        orcus_ods loader(&factory);
        loader.read_file("/path/to/multi-sheets.ods");

        return EXIT_SUCCESS;
    }

Just like the example we used in the previous section, we are also loading a document saved in the Open Document Spreadsheet format via orcus_ods. The document being loaded is named multi-sheets.ods, and contains three sheets which are named ‘1st Sheet’, ‘2nd Sheet’, and ‘3rd Sheet’ in this exact order. When you compile and execute the above code, you should get the following output:

    append_sheet: sheet index: 0; sheet name: 1st Sheet
    append_sheet: sheet index: 1; sheet name: 2nd Sheet
    append_sheet: sheet index: 2; sheet name: 3rd Sheet

One primary role the import factory plays is to provide the orcus import filter with the ability to create and insert a new sheet into the document. As illustrated in the above code, it also provides access to existing sheets by name or by position. Every import factory implementation must be a derived class of the orcus::spreadsheet::iface::import_factory interface base class. At a minimum, it must implement

• the append_sheet() method, which inserts a new sheet and returns access to it,
• two variants of the get_sheet() method, which return access to an existing sheet, and
• the finalize() method, which gets called exactly once at the very end of the import, to give the implementation a chance to perform post-import tasks

in order for the code to be buildable.

Now, since all of the sheet accessor methods return null pointers in this code, the import filter has no way of populating the sheet data. To actually receive the sheet data from the import filter, you must have these methods return valid pointers to sheet accessors. The next example shows how that can be done.

Implement sheet interface

In this section we will expand on the code in the previous section to implement the sheet accessor interface, in order to receive cell values in each individual sheet. In this example, we will define a structure to hold a cell value, and store them in a 2-dimensional array for each sheet. First, let’s define the cell value structure:

    enum class cell_value_type { empty, numeric, string };

    struct cell_value
    {
        cell_value_type type;

        union
        {
            size_t index;
            double f;
        };

        cell_value() : type(cell_value_type::empty) {}
    };

As we will be handling only three cell types (empty, numeric, and string), this structure will work just fine. Next, we’ll define a sheet class called my_sheet that stores the cell values in a 2-dimensional array, and implements all required interfaces as a child class of import_sheet.

At a minimum, the sheet accessor class must implement the following virtual methods to satisfy the interface requirements of import_sheet.

• set_auto() - This is a setter method for a cell whose type is undetermined. The implementor must determine the value type of this cell from the raw string value of the cell. This method is used when loading a CSV document, for instance.
• set_string() - This is a setter method for a cell that stores a string value. All cell string values are expected to be pooled for the entire document, and this method only receives a string index into a centrally-managed string table. The document model is expected to implement a central string table that can translate an index into its actual string value.
• set_value() - This is a setter method for a cell that stores a numeric value.
• set_bool() - This is a setter method for a cell that stores a boolean value. Note that not all format types use this method, as some formats store boolean values as numeric values.
• set_date_time() - This is a setter method for a cell that stores a date time value. As with the boolean value type, some format types may not use this method as they store date time values as numeric values, typically as days since epoch.
• set_format() - This is a setter method for applying cell formats. Just like the string values, cell format properties are expected to be stored in a document-wide cell format properties table, and this method only receives an index into the table.
• get_sheet_size() - This method is expected to return the dimension of the sheet, which the loader may need in some operations.

For now, we’ll only implement set_string(), set_value(), and get_sheet_size(), and leave the rest empty. Here is the actual code for class my_sheet:

    class my_sheet : public iface::import_sheet
    {
        cell_value m_cells[100][1000];
        range_size_t m_sheet_size;
        sheet_t m_sheet_index;

    public:
        my_sheet(sheet_t sheet_index) :
            m_sheet_index(sheet_index)
        {
            m_sheet_size.rows = 1000;
            m_sheet_size.columns = 100;
        }

        virtual void set_string(row_t row, col_t col, size_t sindex) override
        {
            cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
                 << "): string index = " << sindex << endl;

            m_cells[col][row].type = cell_value_type::string;
            m_cells[col][row].index = sindex;
        }

        virtual void set_value(row_t row, col_t col, double value) override
        {
            cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
                 << "): value = " << value << endl;

            m_cells[col][row].type = cell_value_type::numeric;
            m_cells[col][row].f = value;
        }

        virtual range_size_t get_sheet_size() const override
        {
            return m_sheet_size;
        }

        // We don't implement these methods for now.
        virtual void set_auto(row_t row, col_t col, const char* p, size_t n) override {}
        virtual void set_bool(row_t row, col_t col, bool value) override {}
        virtual void set_date_time(
            row_t row, col_t col, int year, int month, int day, int hour, int minute, double second) override {}
        virtual void set_format(row_t row, col_t col, size_t xf_index) override {}
        virtual void set_format(
            row_t row_start, col_t col_start, row_t row_end, col_t col_end, size_t xf_index) override {}
    };
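The set_auto() method is left empty above. As a rough illustration of what "determining the value type from the raw string" could look like, here is a stand-alone sketch. It is not how orcus or its own document model detects types: the detected_type, detected_value and detect_cell_value() names are hypothetical, and the rule used (a string that std::strtod consumes in full is numeric, anything else is text) is a deliberately simple assumption that also accepts inputs such as "inf" or hexadecimal floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical result type for the sketch; not part of the orcus interface.
enum class detected_type { empty, numeric, string };

struct detected_value
{
    detected_type type = detected_type::empty;
    double numeric = 0.0;
    std::string text;
};

// Classify a raw cell string of length n, as a set_auto() implementation
// might do before storing the value.
detected_value detect_cell_value(const char* p, std::size_t n)
{
    assert(p != nullptr || n == 0);

    detected_value v;
    if (n == 0)
        return v; // empty cell

    std::string buf(p, n); // null-terminated copy, as strtod requires
    char* end = nullptr;
    double d = std::strtod(buf.c_str(), &end);

    if (end == buf.c_str() + buf.size())
    {
        // The whole string was consumed as a number.
        v.type = detected_type::numeric;
        v.numeric = d;
    }
    else
    {
        v.type = detected_type::string;
        v.text = std::move(buf);
    }

    return v;
}
```

Inside set_auto(), the numeric case would be stored like a set_value() call, while the string case would first have to be added to the document's string pool to obtain an index.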

Note that this class receives its sheet index value from the caller upon instantiation. A sheet index is a 0-based value and represents the sheet’s position within the sheet collection.

Finally, we will replace the empty factory with a my_import_factory class that stores and manages a collection of my_sheet instances and returns a pointer to the correct sheet accessor instance as needed.

    class my_import_factory : public iface::import_factory
    {
        std::vector<std::unique_ptr<my_sheet>> m_sheets;

    public:
        virtual iface::import_sheet* append_sheet(
            sheet_t sheet_index, const char* sheet_name, size_t sheet_name_length) override
        {
            m_sheets.push_back(std::make_unique<my_sheet>(m_sheets.size()));
            return m_sheets.back().get();
        }

        virtual iface::import_sheet* get_sheet(
            const char* sheet_name, size_t sheet_name_length) override
        {
            // TODO : implement this.
            return nullptr;
        }

        virtual iface::import_sheet* get_sheet(sheet_t sheet_index) override
        {
            sheet_t sheet_count = m_sheets.size();
            return sheet_index < sheet_count ? m_sheets[sheet_index].get() : nullptr;
        }

        virtual void finalize() override {}
    };
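The name-based get_sheet() variant is left as a TODO in the factory above. One possible way to fill it in is to keep a name-to-index map next to the sheet collection, as in the following stand-alone sketch. The named_sheet and sheet_collection types are hypothetical and do not derive from the orcus interfaces, so that the lookup logic can be read and compiled on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a sheet accessor; it only remembers its identity.
struct named_sheet
{
    std::size_t index;
    std::string name;
};

class sheet_collection
{
    std::vector<std::unique_ptr<named_sheet>> m_sheets;
    std::unordered_map<std::string, std::size_t> m_name_to_index;

public:
    // Append a sheet and record its name so it can be looked up later.
    named_sheet* append_sheet(const char* name, std::size_t name_length)
    {
        assert(name != nullptr);

        std::size_t index = m_sheets.size();
        m_sheets.push_back(
            std::make_unique<named_sheet>(named_sheet{index, std::string(name, name_length)}));

        // If the name already exists, the first sheet with that name wins.
        m_name_to_index.emplace(m_sheets.back()->name, index);
        return m_sheets.back().get();
    }

    // Look up by name; returns nullptr when no sheet has that name.
    named_sheet* get_sheet(const char* name, std::size_t name_length) const
    {
        auto it = m_name_to_index.find(std::string(name, name_length));
        return it == m_name_to_index.end() ? nullptr : m_sheets[it->second].get();
    }

    // Look up by 0-based position; returns nullptr when out of range.
    named_sheet* get_sheet(std::size_t index) const
    {
        return index < m_sheets.size() ? m_sheets[index].get() : nullptr;
    }
};
```

In my_import_factory the same map could be filled in append_sheet() from the sheet_name and sheet_name_length arguments, which the version above currently ignores.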

Let’s put it all together and run this code:

    #include <orcus/spreadsheet/import_interface.hpp>
    #include <orcus/orcus_ods.hpp>

    #include <cstdlib>
    #include <iostream>
    #include <memory>

    using namespace std;
    using namespace orcus::spreadsheet;
    using orcus::orcus_ods;

    enum class cell_value_type { empty, numeric, string };

    struct cell_value
    {
        cell_value_type type;

        union
        {
            size_t index;
            double f;
        };

        cell_value() : type(cell_value_type::empty) {}
    };

    class my_sheet : public iface::import_sheet
    {
        cell_value m_cells[100][1000];
        range_size_t m_sheet_size;
        sheet_t m_sheet_index;

    public:
        my_sheet(sheet_t sheet_index) :
            m_sheet_index(sheet_index)
        {
            m_sheet_size.rows = 1000;
            m_sheet_size.columns = 100;
        }

        virtual void set_string(row_t row, col_t col, size_t sindex) override
        {
            cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
                 << "): string index = " << sindex << endl;

            m_cells[col][row].type = cell_value_type::string;
            m_cells[col][row].index = sindex;
        }

        virtual void set_value(row_t row, col_t col, double value) override
        {
            cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
                 << "): value = " << value << endl;

            m_cells[col][row].type = cell_value_type::numeric;
            m_cells[col][row].f = value;
        }

        virtual range_size_t get_sheet_size() const override
        {
            return m_sheet_size;
        }

        // We don't implement these methods for now.
        virtual void set_auto(row_t row, col_t col, const char* p, size_t n) override {}
        virtual void set_bool(row_t row, col_t col, bool value) override {}
        virtual void set_date_time(
            row_t row, col_t col, int year, int month, int day, int hour, int minute, double second) override {}
        virtual void set_format(row_t row, col_t col, size_t xf_index) override {}
        virtual void set_format(
            row_t row_start, col_t col_start, row_t row_end, col_t col_end, size_t xf_index) override {}
    };

    class my_import_factory : public iface::import_factory
    {
        std::vector<std::unique_ptr<my_sheet>> m_sheets;

    public:
        virtual iface::import_sheet* append_sheet(
            sheet_t sheet_index, const char* sheet_name, size_t sheet_name_length) override
        {
            m_sheets.push_back(std::make_unique<my_sheet>(m_sheets.size()));
            return m_sheets.back().get();
        }

        virtual iface::import_sheet* get_sheet(
            const char* sheet_name, size_t sheet_name_length) override
        {
            // TODO : implement this.
            return nullptr;
        }

        virtual iface::import_sheet* get_sheet(sheet_t sheet_index) override
        {
            sheet_t sheet_count = m_sheets.size();
            return sheet_index < sheet_count ? m_sheets[sheet_index].get() : nullptr;
        }

        virtual void finalize() override {}
    };

    int main()
    {
        my_import_factory factory;
        orcus_ods loader(&factory);
        loader.read_file("/path/to/multi-sheets.ods");

        return EXIT_SUCCESS;
    }

We’ll be loading the same document we loaded in the previous example, but this time we will receive its cell values. Let’s go through each sheet one at a time. Data on the first sheet looks like this:

It consists of 4 columns, with each column having a header row followed by exactly ten rows of data. The first and fourth columns contain numeric data, while the second and third columns contain string data. When you run the above code to load this sheet, you’ll get the following output:

    (sheet: 0; row: 0; col: 0): string index = 0
    (sheet: 0; row: 0; col: 1): string index = 0
    (sheet: 0; row: 0; col: 2): string index = 0
    (sheet: 0; row: 0; col: 3): string index = 0
    (sheet: 0; row: 1; col: 0): value = 1
    (sheet: 0; row: 1; col: 1): string index = 0
    (sheet: 0; row: 1; col: 2): string index = 0
    (sheet: 0; row: 1; col: 3): value = 35
    (sheet: 0; row: 2; col: 0): value = 2
    (sheet: 0; row: 2; col: 1): string index = 0
    (sheet: 0; row: 2; col: 2): string index = 0
    (sheet: 0; row: 2; col: 3): value = 56
    (sheet: 0; row: 3; col: 0): value = 3
    (sheet: 0; row: 3; col: 1): string index = 0
    (sheet: 0; row: 3; col: 2): string index = 0
    (sheet: 0; row: 3; col: 3): value = 6
    (sheet: 0; row: 4; col: 0): value = 4
    (sheet: 0; row: 4; col: 1): string index = 0
    (sheet: 0; row: 4; col: 2): string index = 0
    (sheet: 0; row: 4; col: 3): value = 65
    (sheet: 0; row: 5; col: 0): value = 5
    (sheet: 0; row: 5; col: 1): string index = 0
    (sheet: 0; row: 5; col: 2): string index = 0
    (sheet: 0; row: 5; col: 3): value = 88
    (sheet: 0; row: 6; col: 0): value = 6
    (sheet: 0; row: 6; col: 1): string index = 0
    (sheet: 0; row: 6; col: 2): string index = 0
    (sheet: 0; row: 6; col: 3): value = 90
    (sheet: 0; row: 7; col: 0): value = 7
    (sheet: 0; row: 7; col: 1): string index = 0
    (sheet: 0; row: 7; col: 2): string index = 0
    (sheet: 0; row: 7; col: 3): value = 80
    (sheet: 0; row: 8; col: 0): value = 8
    (sheet: 0; row: 8; col: 1): string index = 0
    (sheet: 0; row: 8; col: 2): string index = 0
    (sheet: 0; row: 8; col: 3): value = 66
    (sheet: 0; row: 9; col: 0): value = 9
    (sheet: 0; row: 9; col: 1): string index = 0
    (sheet: 0; row: 9; col: 2): string index = 0
    (sheet: 0; row: 9; col: 3): value = 14
    (sheet: 0; row: 10; col: 0): value = 10
    (sheet: 0; row: 10; col: 1): string index = 0
    (sheet: 0; row: 10; col: 2): string index = 0
    (sheet: 0; row: 10; col: 3): value = 23

There are a couple of things worth pointing out. First, the cell data flows left to right first, then top to bottom. Second, for this particular sheet and for this particular format, implementing just the two setter methods, namely set_string() and set_value(), is enough to receive all cell values. However, we are getting a string index value of 0 for all string cells. This is because orcus expects the backend document model to implement the shared strings interface, which is responsible for providing correct string indices to the import filter, and we have not yet implemented one. Let’s fix that.

Implement shared strings interface

The first thing to do is define some types:

    using ss_type = std::deque<std::string>;
    using ss_hash_type = std::unordered_map<pstring, size_t, pstring::hash>;

Here, we define ss_type to be the authoritative store for the shared string values. The string values will be stored as std::string type, and we use std::deque here to avoid re-allocation of internal buffers as the size of the container grows. Another type we define is ss_hash_type, which will be the hash map type for storing string-to-index mapping entries. Here, we are using pstring instead of std::string so that we can re-use the string values stored in the first container simply by pointing to their memory locations.

The shared string interface is designed to handle both unformatted and formatted string values. The following two methods:

• add()
• append()

are for unformatted string values. The add() method is used when passing a string value that may or may not already exist in the shared string pool. The append() method, on the other hand, is used only when the string value being passed is a brand-new string not yet stored in the string pool. When implementing the append() method, you may skip checking for the existence of the string value in the pool before inserting it. Both of these methods are expected to return a non-negative integer value as the index of the string being passed.


The following eight methods:

• set_segment_bold()
• set_segment_font()
• set_segment_font_color()
• set_segment_font_name()
• set_segment_font_size()
• set_segment_italic()
• append_segment()
• commit_segments()

are for receiving formatted string values. Conceptually, a formatted string consists of a series of string segments, where each segment may have different formatting attributes applied to it. The set_segment_* methods are used to set the individual formatting attributes for the current string segment, and the string value for the current segment is passed through the append_segment() call. The order in which the set_segment_* methods are called is not specified, and not all of them may be called, but they are guaranteed to be called before the append_segment() method gets called. The implementation should keep a buffer to store the formatting attributes for the current segment and apply each attribute to the buffer as one of the set_segment_* methods gets called. When append_segment() gets called, the implementation should apply the formatting attribute set currently in the buffer to the current segment, and reset the buffer for the next segment. When all of the string segments and their formatting attributes have been passed, commit_segments() gets called, signaling the implementation that it is time to commit the string to the document model.

As we are going to ignore the formatting attributes in our current example, the following code will do:

    class my_shared_strings : public iface::import_shared_strings
    {
        ss_hash_type m_ss_hash;
        ss_type& m_ss;
        std::string m_current_string;

    public:
        my_shared_strings(ss_type& ss) : m_ss(ss) {}

        virtual size_t add(const char* s, size_t n) override
        {
            pstring input(s, n);

            auto it = m_ss_hash.find(input);
            if (it != m_ss_hash.end())
                // This string already exists in the pool.
                return it->second;

            // This is a brand-new string.
            return append(s, n);
        }

        virtual size_t append(const char* s, size_t n) override
        {
            size_t string_index = m_ss.size();
            m_ss.emplace_back(s, n);

            // Key the hash entry on the stored copy, not on the caller's buffer.
            const std::string& stored = m_ss.back();
            m_ss_hash.emplace(pstring(stored.data(), stored.size()), string_index);

            return string_index;
        }

        // The following methods are for formatted text segments, which we ignore for now.
        virtual void set_segment_bold(bool b) override {}
        virtual void set_segment_font(size_t font_index) override {}
        virtual void set_segment_font_color(
            color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) override {}
        virtual void set_segment_font_name(const char* s, size_t n) override {}
        virtual void set_segment_font_size(double point) override {}
        virtual void set_segment_italic(bool b) override {}

        virtual void append_segment(const char* s, size_t n) override
        {
            m_current_string += std::string(s, n);
        }

        virtual size_t commit_segments() override
        {
            size_t string_index = m_ss.size();
            m_ss.push_back(std::move(m_current_string));
            m_current_string.clear(); // start the next string from an empty buffer

            const std::string& s = m_ss.back();
            orcus::pstring sv(s.data(), s.size());
            m_ss_hash.emplace(sv, string_index);

            return string_index;
        }
    };
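For readers who do want to keep the formatting, the buffering scheme described above (collect attributes, apply them on append_segment(), then reset) can be sketched as follows. This is a stand-alone illustration and not the orcus document model: segment_format, format_run, formatted_string and segment_builder are hypothetical types, only the bold and italic attributes are tracked, and the class does not derive from import_shared_strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical attribute buffer for one segment (bold and italic only).
struct segment_format
{
    bool bold = false;
    bool italic = false;
};

// One formatted run inside a committed string.
struct format_run
{
    std::size_t pos;     // offset of the run in the text
    std::size_t length;  // length of the run
    segment_format format;
};

struct formatted_string
{
    std::string text;
    std::vector<format_run> runs;
};

class segment_builder
{
    segment_format m_current_format;          // buffer for the current segment
    formatted_string m_current_string;        // string being assembled
    std::vector<formatted_string> m_committed;

public:
    void set_segment_bold(bool b) { m_current_format.bold = b; }
    void set_segment_italic(bool b) { m_current_format.italic = b; }

    // Apply the buffered attributes to this segment, then reset the buffer.
    void append_segment(const char* s, std::size_t n)
    {
        m_current_string.runs.push_back({m_current_string.text.size(), n, m_current_format});
        m_current_string.text.append(s, n);
        m_current_format = segment_format{};
    }

    // Commit the assembled string and return its index.
    std::size_t commit_segments()
    {
        std::size_t index = m_committed.size();
        m_committed.push_back(std::move(m_current_string));
        m_current_string = formatted_string{};
        return index;
    }

    const formatted_string& get(std::size_t index) const
    {
        assert(index < m_committed.size());
        return m_committed[index];
    }
};
```

Calling set_segment_bold(true), append_segment("Hello", 5), append_segment(" world", 6) and then commit_segments() leaves one committed string with two runs, of which only the first is bold.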

Note that some import filters may use the append_segment() and commit_segments() combination even for unformatted strings. Because of this, you still need to implement these two methods even if raw string values are all you care about.

Note also that the container storing the string values is held by reference. The source container is owned by my_import_factory, which also owns the my_shared_strings instance. Shown below is the modified version of my_import_factory that provides the shared string interface:

    class my_import_factory : public iface::import_factory
    {
        ss_type m_string_pool; // string pool to be shared everywhere.
        my_shared_strings m_shared_strings;
        std::vector<std::unique_ptr<my_sheet>> m_sheets;

    public:
        my_import_factory() : m_shared_strings(m_string_pool) {}

        virtual iface::import_shared_strings* get_shared_strings() override
        {
            return &m_shared_strings;
        }

        virtual iface::import_sheet* append_sheet(
            sheet_t sheet_index, const char* sheet_name, size_t sheet_name_length) override
        {
            // Pass the string pool to each sheet instance.
            m_sheets.push_back(std::make_unique<my_sheet>(m_sheets.size(), m_string_pool));
            return m_sheets.back().get();
        }

        virtual iface::import_sheet* get_sheet(
            const char* sheet_name, size_t sheet_name_length) override
        {
            // TODO : implement this.
            return nullptr;
        }

        virtual iface::import_sheet* get_sheet(sheet_t sheet_index) override
        {
            sheet_t sheet_count = m_sheets.size();
            return sheet_index < sheet_count ? m_sheets[sheet_index].get() : nullptr;
        }

        virtual void finalize() override {}
    };

The shared string store is also passed to each sheet instance, and we’ll use that to fetch the string values from their respective string indices. Let’s put this all together:

#include <orcus/spreadsheet/import_interface.hpp>
#include <orcus/orcus_ods.hpp>

#include <cstdlib>
#include <iostream>
#include <memory>
#include <string>
#include <vector>
#include <unordered_map>
#include <deque>

using namespace std;
using namespace orcus::spreadsheet;
using orcus::orcus_ods;
using orcus::pstring;

enum class cell_value_type { empty, numeric, string };

using ss_type = std::deque<std::string>;
using ss_hash_type = std::unordered_map<pstring, size_t, pstring::hash>;

struct cell_value
{
    cell_value_type type;

    union
    {
        size_t index;
        double f;
    };

    cell_value() : type(cell_value_type::empty) {}
};

class my_sheet : public iface::import_sheet
{
    cell_value m_cells[100][1000];
    range_size_t m_sheet_size;
    sheet_t m_sheet_index;
    const ss_type& m_string_pool;

public:
    my_sheet(sheet_t sheet_index, const ss_type& string_pool) :
        m_sheet_index(sheet_index),
        m_string_pool(string_pool)
    {
        m_sheet_size.rows = 1000;
        m_sheet_size.columns = 100;
    }

    virtual void set_string(row_t row, col_t col, size_t sindex) override
    {
        cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
            << "): string index = " << sindex << " (" << m_string_pool[sindex] << ")" << endl;

        m_cells[col][row].type = cell_value_type::string;
        m_cells[col][row].index = sindex;
    }

    virtual void set_value(row_t row, col_t col, double value) override
    {
        cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
            << "): value = " << value << endl;

        m_cells[col][row].type = cell_value_type::numeric;
        m_cells[col][row].f = value;
    }

    virtual range_size_t get_sheet_size() const override
    {
        return m_sheet_size;
    }

    // We don't implement these methods for now.
    virtual void set_auto(row_t row, col_t col, const char* p, size_t n) override {}
    virtual void set_bool(row_t row, col_t col, bool value) override {}
    virtual void set_date_time(
        row_t row, col_t col, int year, int month, int day, int hour, int minute, double second) override {}
    virtual void set_format(row_t row, col_t col, size_t xf_index) override {}
    virtual void set_format(
        row_t row_start, col_t col_start, row_t row_end, col_t col_end, size_t xf_index) override {}
};

class my_shared_strings : public iface::import_shared_strings
{
    ss_hash_type m_ss_hash;
    ss_type& m_ss;
    std::string m_current_string;

public:
    my_shared_strings(ss_type& ss) : m_ss(ss) {}

    virtual size_t add(const char* s, size_t n) override
    {
        pstring input(s, n);

        auto it = m_ss_hash.find(input);
        if (it != m_ss_hash.end())
            // This string already exists in the pool.
            return it->second;

        // This is a brand-new string.
        return append(s, n);
    }

    virtual size_t append(const char* s, size_t n) override
    {
        size_t string_index = m_ss.size();
        m_ss.emplace_back(s, n);

        // Key the hash on the stored copy, not on the caller's buffer,
        // since pstring does not own the characters it points to.
        const std::string& stored = m_ss.back();
        m_ss_hash.emplace(pstring(stored.data(), stored.size()), string_index);

        return string_index;
    }

    // The following methods are for formatted text segments, which we ignore for now.
    virtual void set_segment_bold(bool b) override {}
    virtual void set_segment_font(size_t font_index) override {}
    virtual void set_segment_font_color(
        color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) override {}
    virtual void set_segment_font_name(const char* s, size_t n) override {}
    virtual void set_segment_font_size(double point) override {}
    virtual void set_segment_italic(bool b) override {}

    virtual void append_segment(const char* s, size_t n) override
    {
        m_current_string += std::string(s, n);
    }

    virtual size_t commit_segments() override
    {
        size_t string_index = m_ss.size();
        m_ss.push_back(std::move(m_current_string));
        m_current_string.clear(); // leave the buffer in a known state after the move

        const std::string& s = m_ss.back();
        orcus::pstring sv(s.data(), s.size());
        m_ss_hash.emplace(sv, string_index);

        return string_index;
    }
};

class my_import_factory : public iface::import_factory
{
    ss_type m_string_pool; // string pool to be shared everywhere.
    my_shared_strings m_shared_strings;
    std::vector<std::unique_ptr<my_sheet>> m_sheets;

public:
    my_import_factory() : m_shared_strings(m_string_pool) {}

    virtual iface::import_shared_strings* get_shared_strings() override
    {
        return &m_shared_strings;
    }

    virtual iface::import_sheet* append_sheet(
        sheet_t sheet_index, const char* sheet_name, size_t sheet_name_length) override
    {
        // Pass the string pool to each sheet instance.
        m_sheets.push_back(std::make_unique<my_sheet>(m_sheets.size(), m_string_pool));
        return m_sheets.back().get();
    }

    virtual iface::import_sheet* get_sheet(
        const char* sheet_name, size_t sheet_name_length) override
    {
        // TODO : implement this.
        return nullptr;
    }

    virtual iface::import_sheet* get_sheet(sheet_t sheet_index) override
    {
        sheet_t sheet_count = m_sheets.size();
        return sheet_index < sheet_count ? m_sheets[sheet_index].get() : nullptr;
    }

    virtual void finalize() override {}
};

int main()
{
    my_import_factory factory;
    orcus_ods loader(&factory);
    loader.read_file("/path/to/multi-sheets.ods");

    return EXIT_SUCCESS;
}

The sheet class is largely unchanged except for one thing: it now takes a reference to the string pool and prints the actual string value alongside the string index associated with it. When you execute this code, you’ll see the following output when loading the same sheet:

(sheet: 0; row: 0; col: 0): string index = 0 (ID)
(sheet: 0; row: 0; col: 1): string index = 1 (First Name)
(sheet: 0; row: 0; col: 2): string index = 2 (Last Name)
(sheet: 0; row: 0; col: 3): string index = 3 (Age)
(sheet: 0; row: 1; col: 0): value = 1
(sheet: 0; row: 1; col: 1): string index = 5 (Thia)
(sheet: 0; row: 1; col: 2): string index = 6 (Beauly)
(sheet: 0; row: 1; col: 3): value = 35
(sheet: 0; row: 2; col: 0): value = 2
(sheet: 0; row: 2; col: 1): string index = 9 (Pepito)
(sheet: 0; row: 2; col: 2): string index = 10 (Resun)
(sheet: 0; row: 2; col: 3): value = 56
(sheet: 0; row: 3; col: 0): value = 3
(sheet: 0; row: 3; col: 1): string index = 13 (Emera)
(sheet: 0; row: 3; col: 2): string index = 14 (Gravey)
(sheet: 0; row: 3; col: 3): value = 6
(sheet: 0; row: 4; col: 0): value = 4
(sheet: 0; row: 4; col: 1): string index = 17 (Erinn)
(sheet: 0; row: 4; col: 2): string index = 18 (Flucks)
(sheet: 0; row: 4; col: 3): value = 65
(sheet: 0; row: 5; col: 0): value = 5
(sheet: 0; row: 5; col: 1): string index = 21 (Giusto)
(sheet: 0; row: 5; col: 2): string index = 22 (Bambury)
(sheet: 0; row: 5; col: 3): value = 88
(sheet: 0; row: 6; col: 0): value = 6
(sheet: 0; row: 6; col: 1): string index = 25 (Neall)
(sheet: 0; row: 6; col: 2): string index = 26 (Scorton)
(sheet: 0; row: 6; col: 3): value = 90
(sheet: 0; row: 7; col: 0): value = 7
(sheet: 0; row: 7; col: 1): string index = 29 (Ervin)
(sheet: 0; row: 7; col: 2): string index = 30 (Foreman)
(sheet: 0; row: 7; col: 3): value = 80
(sheet: 0; row: 8; col: 0): value = 8
(sheet: 0; row: 8; col: 1): string index = 33 (Shoshana)
(sheet: 0; row: 8; col: 2): string index = 34 (Bohea)
(sheet: 0; row: 8; col: 3): value = 66
(sheet: 0; row: 9; col: 0): value = 9
(sheet: 0; row: 9; col: 1): string index = 37 (Gladys)
(sheet: 0; row: 9; col: 2): string index = 38 (Somner)
(sheet: 0; row: 9; col: 3): value = 14
(sheet: 0; row: 10; col: 0): value = 10
(sheet: 0; row: 10; col: 1): string index = 41 (Ephraim)
(sheet: 0; row: 10; col: 2): string index = 42 (Russell)
(sheet: 0; row: 10; col: 3): value = 23

The string indices now increment nicely, and their respective string values look correct. Now, let’s turn our attention to the second sheet, which contains formulas. First, here is what the second sheet looks like:

It contains a simple table extending from A1 to C9. It consists of three columns and the first row is a header row. Cells in the first and second columns contain simple numbers and the third column contains formulas that simply add the two numbers to the left of the same row. When loading this sheet using the last code we used above, you’ll see the following output:

(sheet: 1; row: 0; col: 0): string index = 44 (X)
(sheet: 1; row: 0; col: 1): string index = 45 (Y)
(sheet: 1; row: 0; col: 2): string index = 46 (X + Y)
(sheet: 1; row: 1; col: 0): value = 18
(sheet: 1; row: 1; col: 1): value = 79
(sheet: 1; row: 2; col: 0): value = 48
(sheet: 1; row: 2; col: 1): value = 55
(sheet: 1; row: 3; col: 0): value = 99
(sheet: 1; row: 3; col: 1): value = 35
(sheet: 1; row: 4; col: 0): value = 41
(sheet: 1; row: 4; col: 1): value = 69
(sheet: 1; row: 5; col: 0): value = 5
(sheet: 1; row: 5; col: 1): value = 18
(sheet: 1; row: 6; col: 0): value = 46
(sheet: 1; row: 6; col: 1): value = 69
(sheet: 1; row: 7; col: 0): value = 36
(sheet: 1; row: 7; col: 1): value = 67
(sheet: 1; row: 8; col: 0): value = 78
(sheet: 1; row: 8; col: 1): value = 2

Everything looks fine except that the formula cells in C2:C9 are not loaded at all. This is because, in order to receive formula cell data, you must implement the required import_formula interface, which we will cover in the next section.
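Why the formula cells go missing without any error can be pictured with a small, orcus-independent sketch. It assumes a design in which the sheet interface exposes the formula interface through an accessor that returns a null pointer unless it is overridden; the names formula_sink, sheet_sink and deliver_formulas are hypothetical and only illustrate the pattern.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are not the orcus interface classes.
struct formula_sink
{
    virtual ~formula_sink() = default;
    virtual void set_formula(const std::string& expression) = 0;
};

struct sheet_sink
{
    virtual ~sheet_sink() = default;

    // Optional sub-interface: the default says "formulas are not supported".
    virtual formula_sink* get_formula() { return nullptr; }
};

// A loader that skips formula cells when the sheet provides no formula sink.
// Returns the number of formulas that were actually delivered.
inline int deliver_formulas(sheet_sink& sheet, const std::vector<std::string>& expressions)
{
    formula_sink* sink = sheet.get_formula();
    if (!sink)
        return 0; // nothing to deliver to; the cells are silently dropped

    int delivered = 0;
    for (const std::string& expr : expressions)
    {
        sink->set_formula(expr);
        ++delivered;
    }
    return delivered;
}

// A sheet that does not override get_formula() ...
struct plain_sheet : sheet_sink {};

// ... and one that does.
struct formula_sheet : sheet_sink, formula_sink
{
    std::vector<std::string> stored;

    formula_sink* get_formula() override { return this; }
    void set_formula(const std::string& expression) override { stored.push_back(expression); }
};
```

With this arrangement a client only pays for the interfaces it cares about, at the cost that unsupported content is dropped quietly rather than reported.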


Implement formula interface

In this section we will extend the code from the previous section in order to receive and process formula cell values from the sheet. We will need to make quite a few changes. Let’s go over this one thing at a time. First, we are adding a new cell value type formula:

enum class cell_value_type { empty, numeric, string, formula }; // adding a formula type here

which should not come as a surprise. We are not making any change to the cell_value struct itself, but we are re-using its index member for a formula cell value such that, if the cell stores a formula, the index will refer to its actual formula data, which will be stored in a separate data store, much like how strings are stored externally and referenced by their indices in the cell_value instances. We are also adding a brand-new class called cell_grid, to add an extra layer over the raw cell value array:

class cell_grid
{
    cell_value m_cells[100][1000];

public:
    cell_value& operator()(row_t row, col_t col)
    {
        return m_cells[col][row];
    }
};
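The cell_grid accessor trusts the caller to stay inside the fixed 100 by 1000 array. As a hedged variation on the same idea, the sketch below keeps the (row, column) call syntax but stores the cells on the heap and rejects out-of-range positions. checked_cell_grid is a hypothetical name, and plain std::size_t is assumed in place of the orcus row_t and col_t types so that the snippet stands alone.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

enum class cell_value_type { empty, numeric, string, formula };

struct cell_value
{
    cell_value_type type = cell_value_type::empty;

    union
    {
        std::size_t index; // string or formula index
        double f;          // numeric value
    };

    cell_value() : index(0) {}
};

// Hypothetical variant of cell_grid: heap storage plus a range check.
class checked_cell_grid
{
    std::size_t m_rows;
    std::size_t m_cols;
    std::vector<cell_value> m_cells; // column-major, like m_cells[col][row]

public:
    checked_cell_grid(std::size_t rows, std::size_t cols) :
        m_rows(rows), m_cols(cols), m_cells(rows * cols) {}

    cell_value& operator()(std::size_t row, std::size_t col)
    {
        if (row >= m_rows || col >= m_cols)
            throw std::out_of_range("cell position is outside the grid");

        return m_cells[col * m_rows + row];
    }
};
```

Whether the extra check is worth it depends on how much the sheet size reported by get_sheet_size() can be trusted to bound the positions a filter passes in.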

Each sheet instance will own one instance of cell_grid, and the formula interface class instance will hold a reference to it and use it to insert formula cell values. The same sheet instance will also hold a formula value store, and pass its reference to the formula interface class.

The formula interface class must implement the following methods:

• set_position()
• set_formula()
• set_shared_formula_index()
• set_result_string()
• set_result_value()
• set_result_empty()
• set_result_bool()
• commit()

Depending on the type of a formula cell, and depending on the format of the document, some methods may not be called. The set_position() method always gets called regardless of the formula cell type, to specify the position of the formula cell. The set_formula() method gets called for a formula cell that does not share its formula expression with any other formula cells, or for a formula cell that shares its formula expression with a group of other formula cells and is the primary cell of that group. If it is the primary cell of a group of formula cells, the set_shared_formula_index() method also gets called to receive the identifier value of that group. All formula cells belonging to the same group receive the same identifier value via set_shared_formula_index(), but only the primary cell of a group receives the formula expression string via set_formula().

The rest of the methods - set_result_string(), set_result_value(), set_result_empty() and set_result_bool() - are called to deliver the cached formula cell value when applicable.


The commit() method gets called at the very end to let the implementation commit the formula cell data to the backend document store. Without further ado, here is the formula interface implementation that we will use:

class my_formula : public iface::import_formula
{
    sheet_t m_sheet_index;
    cell_grid& m_cells;
    std::vector<formula>& m_formula_store;

    row_t m_row;
    col_t m_col;
    formula m_formula;

public:
    my_formula(sheet_t sheet, cell_grid& cells, std::vector<formula>& formulas) :
        m_sheet_index(sheet),
        m_cells(cells),
        m_formula_store(formulas),
        m_row(0),
        m_col(0) {}

    virtual void set_position(row_t row, col_t col) override
    {
        m_row = row;
        m_col = col;
    }

    virtual void set_formula(formula_grammar_t grammar, const char* p, size_t n) override
    {
        m_formula.expression = std::string(p, n);
        m_formula.grammar = grammar;
    }

    virtual void set_shared_formula_index(size_t index) override {}

    virtual void set_result_string(size_t sindex) override {}
    virtual void set_result_value(double value) override {}
    virtual void set_result_empty() override {}
    virtual void set_result_bool(bool value) override {}

    virtual void commit() override
    {
        cout << "(sheet: " << m_sheet_index << "; row: " << m_row << "; col: " << m_col
            << "): formula = " << m_formula.expression << " (" << m_formula.grammar << ")" << endl;

        size_t index = m_formula_store.size();
        m_cells(m_row, m_col).type = cell_value_type::formula;
        m_cells(m_row, m_col).index = index;
        m_formula_store.push_back(std::move(m_formula));
    }
};


Note that since we are loading an OpenDocument Spreadsheet file (.ods), which does not support shared formulas, we do not need to handle the set_shared_formula_index() method. Likewise, we are leaving the set_result_* methods unhandled for now. This interface class also stores references to the cell_grid and std::vector<formula> instances, both of which are passed from the parent sheet instance.

We also need to make a few changes to the sheet interface class to provide a formula interface and add a formula value store:

class my_sheet : public iface::import_sheet
{
    cell_grid m_cells;
    std::vector<formula> m_formula_store;
    my_formula m_formula_iface;
    range_size_t m_sheet_size;
    sheet_t m_sheet_index;
    const ss_type& m_string_pool;

public:
    my_sheet(sheet_t sheet_index, const ss_type& string_pool) :
        m_formula_iface(sheet_index, m_cells, m_formula_store),
        m_sheet_index(sheet_index),
        m_string_pool(string_pool)
    {
        m_sheet_size.rows = 1000;
        m_sheet_size.columns = 100;
    }

    virtual void set_string(row_t row, col_t col, size_t sindex) override
    {
        cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
            << "): string index = " << sindex << " (" << m_string_pool[sindex] << ")" << endl;

        m_cells(row, col).type = cell_value_type::string;
        m_cells(row, col).index = sindex;
    }

    virtual void set_value(row_t row, col_t col, double value) override
    {
        cout << "(sheet: " << m_sheet_index << "; row: " << row << "; col: " << col
            << "): value = " << value << endl;

        m_cells(row, col).type = cell_value_type::numeric;
        m_cells(row, col).f = value;
    }

    virtual range_size_t get_sheet_size() const override
    {
        return m_sheet_size;
    }

    // We don't implement these methods for now.
    virtual void set_auto(row_t row, col_t col, const char* p, size_t n) override {}
    virtual void set_bool(row_t row, col_t col, bool value) override {}
    virtual void set_date_time(
        row_t row, col_t col, int year, int month, int day, int hour, int minute, double second) override {}
    virtual void set_format(row_t row, col_t col, size_t xf_index) override {}
    virtual void set_format(
        row_t row_start, col_t col_start, row_t row_end, col_t col_end, size_t xf_index) override {}

    virtual iface::import_formula* get_formula() override
    {
        return &m_formula_iface;
    }
};

We’ve added the get_formula() method which returns a pointer to the my_formula class instance defined above. The rest of the code is unchanged. Now let’s see what happens when loading the same sheet from the previous section:

(sheet: 1; row: 0; col: 0): string index = 44 (X)
(sheet: 1; row: 0; col: 1): string index = 45 (Y)
(sheet: 1; row: 0; col: 2): string index = 46 (X + Y)
(sheet: 1; row: 1; col: 0): value = 18
(sheet: 1; row: 1; col: 1): value = 79
(sheet: 1; row: 2; col: 0): value = 48
(sheet: 1; row: 2; col: 1): value = 55
(sheet: 1; row: 3; col: 0): value = 99
(sheet: 1; row: 3; col: 1): value = 35
(sheet: 1; row: 4; col: 0): value = 41
(sheet: 1; row: 4; col: 1): value = 69
(sheet: 1; row: 5; col: 0): value = 5
(sheet: 1; row: 5; col: 1): value = 18
(sheet: 1; row: 6; col: 0): value = 46
(sheet: 1; row: 6; col: 1): value = 69
(sheet: 1; row: 7; col: 0): value = 36
(sheet: 1; row: 7; col: 1): value = 67
(sheet: 1; row: 8; col: 0): value = 78
(sheet: 1; row: 8; col: 1): value = 2
(sheet: 1; row: 1; col: 2): formula = [.A2]+[.B2] (ods)
(sheet: 1; row: 2; col: 2): formula = [.A3]+[.B3] (ods)
(sheet: 1; row: 3; col: 2): formula = [.A4]+[.B4] (ods)
(sheet: 1; row: 4; col: 2): formula = [.A5]+[.B5] (ods)
(sheet: 1; row: 5; col: 2): formula = [.A6]+[.B6] (ods)
(sheet: 1; row: 6; col: 2): formula = [.A7]+[.B7] (ods)
(sheet: 1; row: 7; col: 2): formula = [.A8]+[.B8] (ods)
(sheet: 1; row: 8; col: 2): formula = [.A9]+[.B9] (ods)

Looks like we are getting the formula cell values this time around.

One thing to note is that the formula expression strings you see here follow the syntax rules of the OpenFormula specification, which is the formula syntax referenced by the OpenDocument Spreadsheet format.
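For formats that do share formula expressions between cells, the call order described earlier (set_position(), then optionally set_formula() and set_shared_formula_index(), then commit()) suggests one way a handler could resolve group members. The sketch below is only an illustration under stated assumptions: formula_cell and formula_collector are hypothetical names, the methods mirror the interface method names but do not derive from any orcus class, and the primary cell of a group is assumed to be committed before the other members.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one committed formula cell.
struct formula_cell
{
    int row = 0;
    int col = 0;
    std::string expression; // own expression, or the group's primary expression
    bool shared = false;
};

// Hypothetical collector following the call order described in the text.
class formula_collector
{
    std::map<std::size_t, std::string> m_groups; // group id -> primary expression
    std::vector<formula_cell> m_cells;

    formula_cell m_current;
    bool m_has_expression = false;
    bool m_has_group = false;
    std::size_t m_group = 0;

public:
    void set_position(int row, int col)
    {
        m_current.row = row;
        m_current.col = col;
    }

    void set_formula(std::string expression)
    {
        m_current.expression = std::move(expression);
        m_has_expression = true;
    }

    void set_shared_formula_index(std::size_t index)
    {
        m_group = index;
        m_has_group = true;
    }

    void commit()
    {
        if (m_has_group)
        {
            m_current.shared = true;
            if (m_has_expression)
                m_groups[m_group] = m_current.expression; // primary cell of the group
            else
            {
                auto it = m_groups.find(m_group);
                if (it != m_groups.end())
                    m_current.expression = it->second; // reuse the primary expression
            }
        }

        m_cells.push_back(std::move(m_current));

        // Reset the per-cell state for the next formula cell.
        m_current = formula_cell();
        m_has_expression = false;
        m_has_group = false;
    }

    const std::vector<formula_cell>& cells() const { return m_cells; }
};
```

A real implementation would also need to shift the relative cell references of the primary expression for each group member; this sketch copies the expression unchanged.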


Implement more interfaces

This section has covered only a part of the available spreadsheet interfaces you can implement in your code. Refer to the Spreadsheet Interface section to see the complete list of interfaces.

1.3 Loading hierarchical documents

The orcus library also includes support for hierarchical document types such as JSON and YAML. The following sections delve more into the support for these types of documents.

1.3.1 JSON

The JSON part of orcus consists of a low-level parser class that handles parsing of JSON strings, and a high-level document class that stores parsed JSON structures as a node tree.

There are two approaches to processing JSON strings using the orcus library. One approach is to utilize the document_tree class to load and populate the JSON structure tree via its load() method and traverse the tree through its get_document_root() method. This approach is ideal if you want a quick way to parse and access the content of a JSON document with minimal effort.

Another approach is to use the low-level json_parser class directly by providing your own handler class to receive callbacks from the parser. This method requires a bit more effort on your part to provide and populate your own data structure, but if you already have a data structure to store the content of JSON, then this approach is ideal. The document_tree class internally uses json_parser to parse JSON contents.

Populating a document tree from JSON string

The following code snippet shows an example of how to populate an instance of document_tree from a JSON string, and navigate its content tree afterward.

#include <orcus/json_document_tree.hpp>
#include <orcus/config.hpp>
#include <orcus/pstring.hpp>

#include <cstdlib>
#include <iostream>

using namespace std;

const char* json_string = "{"
" \"name\": \"John Doe\","
" \"occupation\": \"Software Engineer\","
" \"score\": [89, 67, 90]"
"}";

int main()
{
    using node = orcus::json::node;

    orcus::json_config config; // Use default configuration.

    orcus::json::document_tree doc;
    doc.load(json_string, config);

    // Root is an object containing three key-value pairs.
    node root = doc.get_document_root();

    for (const orcus::pstring& key : root.keys())
    {
        node value = root.child(key);
        switch (value.type())
        {
            case orcus::json::node_t::string:
                // string value
                cout << key << ": " << value.string_value() << endl;
                break;
            case orcus::json::node_t::array:
            {
                // array value
                cout << key << ":" << endl;

                for (size_t i = 0; i < value.child_count(); ++i)
                {
                    node array_element = value.child(i);
                    cout << "- " << array_element.numeric_value() << endl;
                }
            }
            break;
            default:
                ;
        }
    }

    return EXIT_SUCCESS;
}

You’ll see the following output when executing this code:

name: John Doe
occupation: Software Engineer
score:
- 89
- 67
- 90

Using the low-level parser

The following code snippet shows how to use the low-level json_parser class by providing your own handler class and passing it as a template argument:

#include <orcus/json_parser.hpp>
#include <orcus/pstring.hpp>
#include <cstring>
#include <iostream>

using namespace std;

class json_parser_handler : public orcus::json_handler
{
public:
    void object_key(const char* p, size_t len, bool transient)
    {
        cout << "object key: " << orcus::pstring(p, len) << endl;
    }

    void string(const char* p, size_t len, bool transient)
    {
        cout << "string: " << orcus::pstring(p, len) << endl;
    }

    void number(double val)
    {
        cout << "number: " << val << endl;
    }
};

int main()
{
    const char* test_code = "{\"key1\": [1,2,3,4,5], \"key2\": 12.3}";
    size_t n = strlen(test_code);

    cout << "JSON string: " << test_code << endl;

    // Instantiate the parser with our own handler.
    json_parser_handler hdl;
    orcus::json_parser<json_parser_handler> parser(test_code, n, hdl);

    // Parse the string.
    parser.parse();

    return EXIT_SUCCESS;
}

The parser constructor expects the char array, its length, and the handler instance. The base handler class json_handler implements all required handler methods. By inheriting from it, you only need to implement the handler methods you need. In this example, we are only implementing the object_key(), string(), and number() methods to process object key values, string values and numeric values, respectively. Refer to the json_handler class definition for all available handler methods. Executing this code will generate the following output:


JSON string: {"key1": [1,2,3,4,5], "key2": 12.3}
object key: key1
number: 1
number: 2
number: 3
number: 4
number: 5
object key: key2
number: 12.3
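The general pattern used here, a parser class template that takes the handler type as a template argument plus a base handler whose methods all do nothing, can be shown in isolation. The following toy example is not part of orcus: number_list_parser, list_handler and collecting_handler are hypothetical names, and the input format (comma-separated numbers) is chosen only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Base handler with empty callbacks, so a derived handler only redefines what it needs.
struct list_handler
{
    void begin_list() {}
    void end_list() {}
    void number(double) {}
};

// Hypothetical toy parser for "1,2,3"-style input.
// The handler type is a template argument, so callbacks are resolved at compile time.
template<typename Handler>
class number_list_parser
{
    const char* m_p;
    const char* m_end;
    Handler& m_handler;

public:
    number_list_parser(const char* p, std::size_t n, Handler& handler) :
        m_p(p), m_end(p + n), m_handler(handler) {}

    void parse()
    {
        m_handler.begin_list();
        while (m_p < m_end)
        {
            const char* token_end = m_p;
            while (token_end < m_end && *token_end != ',')
                ++token_end;

            // Copy the token so that strtod sees a null-terminated string.
            std::string token(m_p, token_end);
            if (!token.empty())
                m_handler.number(std::strtod(token.c_str(), nullptr));

            m_p = token_end < m_end ? token_end + 1 : token_end;
        }
        m_handler.end_list();
    }
};

// Only number() is redefined; begin_list() and end_list() fall back to the base.
struct collecting_handler : list_handler
{
    std::vector<double> values;
    void number(double v) { values.push_back(v); }
};
```

Because no virtual functions are involved, a handler method that is left undefined simply resolves to the empty base version, and the compiler is free to inline the callbacks.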

Building a document tree directly

You can also create and populate a JSON document tree directly without needing to parse a JSON string. This approach is ideal if you want to create a JSON tree from scratch and export it as a string. The following series of code snippets demonstrates exactly how to build JSON document trees directly and export their contents as JSON strings.

The first example shows how to initialize the tree with a simple array:

orcus::json::document_tree doc = {
    1.0, 2.0, "string value", false, nullptr
};

std::cout << doc.dump() << std::endl;

You can simply specify the content of the array via an initializer list and assign it to the document. The dump() method then turns the content into a single string instance, which looks like the following:

[
    1,
    2,
    "string value",
    false,
    null
]

If you need to build an array of arrays, do the following:

orcus::json::document_tree doc = {
    { true, false, nullptr },
    { 1.1, 2.2, "text" }
};

std::cout << doc.dump() << std::endl;

This will create an array of two nested child arrays with three values each. Dumping the content of the tree as a JSON string will produce something like the following:

[
    [
        true,
        false,
        null
    ],
    [
        1.1,
        2.2,
        "text"
    ]
]

Creating an object can be done by nesting one or more key-value pairs, each of which is surrounded by a pair of curly braces, inside another pair of curly braces. For example, the following code:

orcus::json::document_tree doc = {
    {"key1", 1.2},
    {"key2", "some text"},
};

std::cout << doc.dump() << std::endl;

produces the following output:

{
    "key1": 1.2,
    "key2": "some text"
}

indicating that the tree consists of a single object having two key-value pairs.

You may notice that this syntax is identical to the syntax for creating an array of arrays as shown above. In fact, in order for this to be an object, each of the inner sequences must have exactly two values, and its first value must be a string value. Failing that, it will be interpreted as an array of arrays.

As with arrays, nesting of objects is also supported. The following code:

orcus::json::document_tree doc = {
    {"parent1", {
            {"child1", true},
            {"child2", false},
            {"child3", 123.4},
        }
    },
    {"parent2", "not-nested"},
};

std::cout << doc.dump() << std::endl;

creates a root object having two key-value pairs, one of which contains another object having three key-value pairs, as evident in the following output generated by this code:

{
    "parent1": {
        "child1": true,
        "child2": false,
        "child3": 123.4
    },
    "parent2": "not-nested"
}

There is one caveat that you need to be aware of because of this special object creation syntax. When you have a nested array that contains exactly two values and the first value is a string value, you must explicitly declare it as an array by using an array class instance. For instance, this code:

orcus::json::document_tree doc = {
    {"array", {"one", 987.0}}
};

is intended to be an object containing an array. However, because the supposed inner array contains exactly two values and the first value is a string value, which could be interpreted as a key-value pair for the outer object, it ends up being too ambiguous and a key_value_error exception gets thrown as a result.

To work around this ambiguity, you need to declare the inner array explicitly by using an array instance:

using namespace orcus;

json::document_tree doc = {
    {"array", json::array({"one", 987.0})}
};

This code now correctly generates a root object containing one key-value pair whose value is an array: { "array": [ "one", 987 ] }
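The rule stated earlier for telling an object from an array of arrays (every inner sequence has exactly two values and starts with a string) is simple enough to write down as a predicate. The sketch below is only a model of that rule as the text describes it, not the logic orcus actually runs; value_kind, root_kind and classify are hypothetical names, and each value is reduced to its kind.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tags for the kind of each value in an initializer sequence.
enum class value_kind { null, boolean, number, string, nested };

using sequence = std::vector<value_kind>;

enum class root_kind { object, array_of_arrays };

// Apply the rule from the text to a non-empty list of inner sequences:
// it is an object only if every inner sequence is a (string, value) pair.
inline root_kind classify(const std::vector<sequence>& inner)
{
    assert(!inner.empty());

    for (const sequence& seq : inner)
    {
        bool key_value = seq.size() == 2 && seq[0] == value_kind::string;
        if (!key_value)
            return root_kind::array_of_arrays;
    }

    return root_kind::object;
}
```

Under this model an inner array such as {"one", 987.0} is indistinguishable from a key-value pair by shape alone, which is the ambiguity that the explicit array instance removes.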

A similar ambiguity arises when you want to construct a tree consisting only of an empty root object. You may be tempted to write something like this:

using namespace orcus;

json::document_tree doc = {};

However, this will leave the tree entirely unpopulated, i.e. the tree will not even have a root node! If you continue on and try to get a root node from this tree, you’ll get a document_error thrown as a result. If you inspect the error message stored in the exception:

try
{
    auto root = doc.get_document_root();
}
catch (const json::document_error& e)
{
    std::cout << e.what() << std::endl;
}

you will get

json::document_error: document tree is empty

giving you further proof that the tree is indeed empty! The solution here is to directly assign an instance of object to the document tree, which will initialize the tree with an empty root object. The following code:

using namespace orcus;

json::document_tree doc = json::object();

std::cout << doc.dump() << std::endl;

will therefore generate

{
}

You can also use object class instances to indicate empty objects anywhere in the tree. For instance, this code:

using namespace orcus;

json::document_tree doc = {
    json::object(),
    json::object(),
    json::object()
};

is intended to create an array containing three empty objects as its elements, and that’s exactly what it does:

[
    {
    },
    {
    },
    {
    }
]

So far all the examples have shown how to initialize the document tree as the tree itself is being constructed. Our next example shows how to add new key-value pairs to existing objects after the document tree instance has been initialized.

using namespace orcus;

// Initialize the tree with an empty object.
json::document_tree doc = json::object();

// Get the root object, and assign three key-value pairs.
json::node root = doc.get_document_root();
root["child1"] = 1.0;
root["child2"] = "string";
root["child3"] = { true, false }; // implicit array

// You can also create a key-value pair whose value is another object.
root["child object"] = {
    {"key1", 100.0},
    {"key2", 200.0}
};

root["child array"] = json::array({ 1.1, 1.2, true }); // explicit array

This code first initializes the tree with an empty object, then retrieves the root empty object and assigns several key-value pairs to it. When converting the tree content to a string and inspecting it you’ll see something like the following: { "child array": [ 1.1, 1.2, true ], "child1": 1, "child3": [ true, false ], "child2": "string", "child object": { "key1": 100, "key2": 200 } }

The next example shows how to append values to an existing array after the tree has been constructed. Let’s take a look at the code:

using namespace orcus;

// Initialize the tree with an empty array root.
json::document_tree doc = json::array();

// Get the root array.
json::node root = doc.get_document_root();

// Append values to the array.
root.push_back(-1.2);
root.push_back("string");
root.push_back(true);
root.push_back(nullptr);

// You can append an object to the array via push_back() as well.
root.push_back({{"key1", 1.1}, {"key2", 1.2}});

Like the previous example, this code first initializes the tree but this time with an empty array as its root, retrieves the root array, then appends several values to it via its push_back() method. When you dump the content of this tree as a JSON string you’ll get something like this:


[ -1.2, "string", true, null, { "key1": 1.1, "key2": 1.2 } ]

1.3.2 YAML

TBD

CHAPTER TWO

C++ API

2.1 Low-Level Parsers and Utilities

2.1.1 Utilities

class orcus::pstring

This string class does not have its own char array buffer; it only stores the memory position of the first char of an existing char array and its size. When using this class, it is important that the string object referenced by an instance of this class stays valid during the lifetime of that instance.

Public Functions

inline pstring()

pstring(const char *_pos)

inline pstring(const char *_pos, size_t _size)

inline pstring(const std::string &s)

inline ::std::string str() const

inline size_t size() const

inline const char &operator[](size_t idx) const

inline pstring &operator=(const pstring &r)

inline const char *get() const

inline const char *data() const

bool operator==(const pstring &r) const


inline bool operator!=(const pstring &r) const

bool operator<(const pstring &r) const

bool operator==(const char *_str) const

inline bool operator!=(const char *_str) const

pstring trim() const

inline bool empty() const

inline void clear()

void resize(size_t new_size)

struct hash

Public Functions

size_t operator()(const pstring &val) const

class orcus::string_pool

Implements string hash map.

Public Functions

string_pool(const string_pool&) = delete

string_pool &operator=(const string_pool&) = delete

string_pool()

~string_pool()

std::pair<pstring, bool> intern(const char *str)
Intern a string.
Parameters str – string to intern. It must be null-terminated.
Returns pair whose first value is the interned string, and the second value specifies whether it is a newly created instance (true) or a reuse of an existing instance (false).

std::pair<pstring, bool> intern(const char *str, size_t n)
Intern a string.
Parameters
• str – string to intern. It doesn’t need to be null-terminated.
• n – length of the string.
Returns see above.

std::pair<pstring, bool> intern(const pstring &str)
Intern a string.
Parameters str – string to intern.
Returns see above.

std::vector get_interned_strings() const
Return all interned strings.
Returns sequence of all interned strings. The sequence will be sorted.

void dump() const

void clear()

size_t size() const

void swap(string_pool &other)

void merge(string_pool &other)
Merge another string pool instance in. This will not invalidate any string references to the other pool. The other string pool instance will become empty when this call returns.
Parameters other – string pool instance to merge in.

class orcus::tokens

Public Functions

tokens(const char **token_names, size_t token_name_count)

bool is_valid_token(xml_token_t token) const Check if a token returned from get_token() method is valid. Returns true if valid, false otherwise. xml_token_t get_token(const pstring &name) const Get token from a specified name. Parameters name – textural token name Returns token value representing the given textural token. const char *get_token_name(xml_token_t token) const Get textural token name from a token value. Parameters token – numeric token value Returns textural token name, or empty string in case the given token is not valid.

class orcus::cell_buffer Temporary cell buffer used to convert cell values when needed. This is used in the sax and csv parsers.

Public Functions

cell_buffer() Logical buffer size. May differ from the actual buffer size. void append(const char *p, size_t len)

void reset()

const char *get() const

size_t size() const

bool empty() const

class orcus::zip_archive

Public Functions

zip_archive(zip_archive_stream *stream)

~zip_archive()

void load() Loading involves parsing the central directory of a zip archive (located toward the end of the stream) and building the file entry data stored in the central directory.

void dump_file_entry(size_t index) const Dump the content of a specified file entry to stdout. Parameters index – file entry index.

void dump_file_entry(const char *entry_name) const Dump the content of a specified file entry to stdout. Parameters entry_name – file entry name.

pstring get_file_entry_name(size_t index) const Get file entry name from its index. Parameters index – file entry index. Returns file entry name.

size_t get_file_entry_count() const Return the number of file entries stored in this zip archive. Note that a file entry may be a directory, so the number of files stored in the zip archive may not equal the number of file entries. Returns number of file entries.

bool read_file_entry(const pstring &entry_name, std::vector &buf) const Retrieve the data stream of a specified file entry into a buffer. The retrieved data stream gets uncompressed if the original stream is compressed. The method will overwrite the content of the passed buffer if there is any pre-existing data in it. Parameters • entry_name – file entry name. • buf – buffer to put the retrieved data stream into. Returns true if successful, false otherwise.

2.1.2 XML Types

typedef size_t orcus::xml_token_t

typedef const char *orcus::xmlns_id_t

struct orcus::xml_name_t

Public Types

enum to_string_type Values:

enumerator use_alias enumerator use_short_name

Public Functions

xml_name_t()

xml_name_t(xmlns_id_t _ns, const pstring &_name)

xml_name_t(const xml_name_t &r)

xml_name_t &operator=(const xml_name_t &other)

bool operator==(const xml_name_t &other) const

bool operator!=(const xml_name_t &other) const

std::string to_string(const xmlns_context &cxt, to_string_type type) const

std::string to_string(const xmlns_repository &repo) const

Public Members

xmlns_id_t ns

pstring name

struct orcus::xml_token_attr_t

Public Functions

xml_token_attr_t()

xml_token_attr_t(xmlns_id_t _ns, xml_token_t _name, const pstring &_value, bool _transient)

xml_token_attr_t(xmlns_id_t _ns, xml_token_t _name, const pstring &_raw_name, const pstring &_value, bool _transient)

Public Members

xmlns_id_t ns

xml_token_t name

pstring raw_name

pstring value

bool transient Whether or not the attribute value is transient. A transient value is only guaranteed to be valid until the end of the start_element call, after which its validity is not guaranteed. A non-transient value is guaranteed to be valid during the life cycle of the stream it belongs to.

2.1.3 Other Types

enum orcus::length_unit_t Values:

enumerator unknown
enumerator centimeter
enumerator millimeter
enumerator xlsx_column_digit
enumerator inch
enumerator point
enumerator twip
enumerator pixel

struct orcus::date_time_t

Public Functions

date_time_t()

date_time_t(int _year, int _month, int _day)

date_time_t(int _year, int _month, int _day, int _hour, int _minute, double _second)

date_time_t(const date_time_t &other)

~date_time_t()

date_time_t &operator=(date_time_t other)

bool operator==(const date_time_t &other) const

bool operator!=(const date_time_t &other) const

std::string to_string() const

void swap(date_time_t &other)

Public Members

int year int month int day int hour int minute double second

2.1.4 CSS Parser

template<typename _Handler> class orcus::css_parser : public orcus::css::parser_base

Public Types

typedef _Handler handler_type

Public Functions

css_parser(const char *p, size_t n, handler_type &hdl)

void parse()

Parser Handler

class orcus::css_handler Empty handler for CSS parser. Sub-class from it and implement necessary methods.

Public Functions

inline void at_rule_name(const char *p, size_t n)

inline void simple_selector_type(const char *p, size_t n)

inline void simple_selector_class(const char *p, size_t n)

inline void simple_selector_pseudo_element(orcus::css::pseudo_element_t pe)

inline void simple_selector_pseudo_class(orcus::css::pseudo_class_t pc)

inline void simple_selector_id(const char *p, size_t n)

inline void end_simple_selector()

inline void end_selector()

inline void combinator(orcus::css::combinator_t combinator)

inline void property_name(const char *p, size_t n) Called at each property name. Parameters • p – pointer to the char-array containing the property name string. • n – length of the property name string.

inline void value(const char *p, size_t n) Called at each ordinary property value string. Parameters • p – pointer to the char-array containing the value string. • n – length of the value string.

inline void rgb(uint8_t red, uint8_t green, uint8_t blue) Called at each RGB color value of a property. Parameters • red – value of red (0-255) • green – value of green (0-255) • blue – value of blue (0-255)

inline void rgba(uint8_t red, uint8_t green, uint8_t blue, double alpha) Called at each RGB color value of a property with alpha transparency value. Parameters • red – value of red (0-255) • green – value of green (0-255) • blue – value of blue (0-255) • alpha – alpha transparency value

inline void hsl(uint8_t hue, uint8_t sat, uint8_t light) Called at each HSL color value of a property. Parameters • hue – hue • sat – saturation • light – lightness

inline void hsla(uint8_t hue, uint8_t sat, uint8_t light, double alpha) Called at each HSL color value of a property with alpha transparency value. Parameters • hue – hue • sat – saturation • light – lightness • alpha – alpha value

inline void url(const char *p, size_t n) Called at each URL value of a property. Parameters • p – pointer to the char-array containing the URL value string. • n – length of the URL value string.

inline void begin_parse() Called when the parsing begins.

inline void end_parse() Called when the parsing ends.

inline void begin_block() Called at the beginning of each block. An opening brace ‘{’ marks the beginning of a block. inline void end_block() Called at the end of each block. A closing brace ‘}’ marks the end of a block. inline void begin_property() Called at the beginning of each property. inline void end_property() Called at the end of each property.
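As an illustration of how the property callbacks fit together, here is a minimal handler sketch that collects name/value pairs. It uses the callback signatures listed above but is driven by hand; property_collector and its members are hypothetical names. In real use such a class would derive from orcus::css_handler, so that the callbacks it does not implement keep their empty defaults, and would be passed to orcus::css_parser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical handler sketch; only the property-related callbacks do work.
struct property_collector
{
    std::vector<std::pair<std::string, std::string>> properties;
    std::string cur_name;  // name of the property being parsed
    std::string cur_value; // space-separated values of that property

    // Helper (not a parser callback): append one value to the current property.
    void append(const std::string& s)
    {
        if (!cur_value.empty())
            cur_value += ' ';
        cur_value += s;
    }

    void begin_property() { cur_name.clear(); cur_value.clear(); }

    void property_name(const char* p, std::size_t n) { cur_name.assign(p, n); }

    void value(const char* p, std::size_t n) { append(std::string(p, n)); }

    void rgb(std::uint8_t red, std::uint8_t green, std::uint8_t blue)
    {
        // Store the color as a 6-digit hex string such as "#ff0010".
        char buf[8];
        std::snprintf(buf, sizeof(buf), "#%02x%02x%02x",
            static_cast<unsigned>(red), static_cast<unsigned>(green),
            static_cast<unsigned>(blue));
        append(buf);
    }

    void end_property() { properties.emplace_back(cur_name, cur_value); }
};
```

A property with several values, such as a shorthand margin, results in several value() calls between one begin_property() and end_property() pair in this sketch.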

CSS Types

enum orcus::css::combinator_t Values:

enumerator descendant ‘E F’ where F is a descendant of E.

enumerator direct_child ‘E > F’ where F is a direct child of E.

enumerator next_sibling ‘E + F’ where F is a direct sibling of E where E precedes F.

enum orcus::css::property_function_t List of functions used as property values. Values:

enumerator unknown enumerator hsl enumerator hsla enumerator rgb enumerator rgba enumerator url enum orcus::css::property_value_t Values:

enumerator none enumerator string enumerator hsl enumerator hsla enumerator rgb enumerator rgba enumerator url

using orcus::css::pseudo_element_t = uint16_t using orcus::css::pseudo_class_t = uint64_t

2.1.5 CSV Parser

template<typename _Handler> class orcus::csv_parser : public orcus::csv::parser_base

Public Types

typedef _Handler handler_type

Public Functions

csv_parser(const char *p, size_t n, handler_type &hdl, const csv::parser_config &config)

void parse()

struct orcus::csv::parser_config Run-time configuration object for orcus::csv_parser.

Public Functions

parser_config()

Public Members

std::string delimiters One or more characters that serve as cell boundaries.

char text_qualifier A single character used as a text quote value.

bool trim_cell_value When true, the value of each cell is trimmed, i.e. any leading or trailing whitespace is ignored.

Parser Handler

class orcus::csv_handler

Public Functions

inline void begin_parse() Called when the parser starts parsing a stream.

inline void end_parse() Called when the parser finishes parsing a stream.

inline void begin_row() Called at the beginning of every row.

inline void end_row() Called at the end of every row.

inline void cell(const char *p, size_t n, bool transient) Called after every cell is parsed. Parameters • p – pointer to the first character of a cell content. • n – number of characters the cell content consists of. • transient – when true, the text content has been converted and is stored in a temporary buffer. In such a case, there is no guarantee that the text content remains available after the end of the call. When this value is false, the text content is guaranteed to be valid so long as the original CSV stream content is valid.
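The way a handler type is plugged into a parser template, and the order in which the callbacks above fire, can be sketched with a toy parser. Everything below (toy_config, toy_csv_parser, row_collector) is a hypothetical stand-in that only mirrors the documented names; it is not the orcus implementation, it ignores text_qualifier, and it always passes transient = false because it never copies cell content into a temporary buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical configuration mirroring two of the documented members.
struct toy_config
{
    std::string delimiters = ",";
    bool trim_cell_value = true;
};

// Toy parser; the handler is a template parameter, so callbacks are
// resolved at compile time rather than through function pointers.
template<typename Handler>
class toy_csv_parser
{
    const char* m_p;
    std::size_t m_n;
    Handler& m_hdl;
    const toy_config& m_cfg;

    void emit_cell(const char* first, const char* last)
    {
        if (m_cfg.trim_cell_value)
        {
            while (first < last && *first == ' ') ++first;
            while (first < last && *(last - 1) == ' ') --last;
        }
        // The pointer refers to the original stream, hence transient = false.
        m_hdl.cell(first, static_cast<std::size_t>(last - first), false);
    }

public:
    toy_csv_parser(const char* p, std::size_t n, Handler& hdl, const toy_config& cfg) :
        m_p(p), m_n(n), m_hdl(hdl), m_cfg(cfg) {}

    void parse()
    {
        m_hdl.begin_parse();
        const char* p = m_p;
        const char* end = m_p + m_n;
        while (p < end)
        {
            m_hdl.begin_row();
            const char* cell_start = p;
            for (; p < end && *p != '\n'; ++p)
            {
                if (m_cfg.delimiters.find(*p) != std::string::npos)
                {
                    emit_cell(cell_start, p);
                    cell_start = p + 1;
                }
            }
            emit_cell(cell_start, p); // last cell of the row
            m_hdl.end_row();
            if (p < end) ++p; // skip the line break
        }
        m_hdl.end_parse();
    }
};

// Handler that stores each row as a vector of strings.
struct row_collector
{
    std::vector<std::vector<std::string>> rows;

    void begin_parse() {}
    void end_parse() {}
    void begin_row() { rows.emplace_back(); }
    void end_row() {}
    void cell(const char* p, std::size_t n, bool /*transient*/)
    {
        // Copying is safe whether or not the value is transient.
        rows.back().emplace_back(p, n);
    }
};
```

A handler that wants to avoid copies could instead keep the pointer and length when transient is false, provided the original stream outlives the stored values.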

2.1.6 JSON Parser

template<typename _Handler> class orcus::json_parser : public orcus::json::parser_base Low-level JSON parser. The caller must provide a handler class to receive callbacks.

Public Types

typedef _Handler handler_type

Public Functions

json_parser(const char *p, size_t n, handler_type &hdl) Constructor. Parameters • p – pointer to a string stream containing JSON string. • n – size of the stream. • hdl – handler class instance.

void parse() Call this method to start parsing.

Parser Handler

class orcus::json_handler

Public Functions

inline void begin_parse() Called when the parsing begins.

inline void end_parse() Called when the parsing ends.

inline void begin_array() Called when the opening bracket of an array is encountered.

inline void end_array() Called when the closing bracket of an array is encountered.

inline void begin_object() Called when the opening curly brace of an object is encountered.

inline void object_key(const char *p, size_t len, bool transient) Called when a key value string of an object is encountered. Parameters • p – pointer to the first character of the key value string. • len – length of the key value string. • transient – true if the string value is stored in a temporary buffer which is not guaranteed to hold the string value after the end of this callback. When false, the pointer points to somewhere in the JSON stream being parsed.

inline void end_object() Called when the closing curly brace of an object is encountered.

inline void boolean_true() Called when a boolean ‘true’ keyword is encountered.

inline void boolean_false() Called when a boolean ‘false’ keyword is encountered.

inline void null() Called when a ‘null’ keyword is encountered.

inline void string(const char *p, size_t len, bool transient) Called when a string value is encountered. Parameters • p – pointer to the first character of the string value. • len – length of the string value. • transient – true if the string value is stored in a temporary buffer which is not guaranteed to hold the string value after the end of this callback. When false, the pointer points to somewhere in the JSON stream being parsed.

inline void number(double val) Called when a numeric value is encountered. Parameters val – numeric value.
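The transient flag on object_key() and string() determines whether a handler may keep the pointer it receives. The sketch below shows one way to honor it: copy only when transient is true, and otherwise keep a pointer into the stream. The name key_recorder and its members are hypothetical, and the sketch is driven by hand rather than by orcus::json_parser.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical handler sketch using the documented callback signatures.
struct key_recorder
{
    struct view { const char* p; std::size_t n; };

    // std::deque keeps references to existing elements valid on emplace_back.
    std::deque<std::string> owned; // copies of transient keys
    std::vector<view> keys;        // all keys seen, in order

    void begin_parse() {}
    void end_parse() {}
    void begin_array() {}
    void end_array() {}
    void begin_object() {}
    void end_object() {}
    void boolean_true() {}
    void boolean_false() {}
    void null() {}
    void string(const char*, std::size_t, bool) {}
    void number(double) {}

    void object_key(const char* p, std::size_t len, bool transient)
    {
        if (transient)
        {
            // The temporary buffer may be reused after this call returns,
            // so keep a private copy and point at that instead.
            owned.emplace_back(p, len);
            p = owned.back().data();
        }
        keys.push_back(view{p, len});
    }
};
```

The non-transient views remain valid only as long as the JSON stream passed to the parser stays alive.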

2.1.7 XML Parsers

template<typename _Handler, typename _Config> class orcus::sax_parser : public orcus::sax::parser_base Template-based sax parser that doesn’t use function pointers for callbacks, for better performance, especially on large XML streams.

Public Types

typedef _Handler handler_type typedef _Config config_type

Public Functions

sax_parser(const char *content, const size_t size, handler_type &handler)

sax_parser(const char *content, const size_t size, bool transient_stream, handler_type &handler)

~sax_parser()

void parse()

template<typename _Handler> class orcus::sax_ns_parser SAX based XML parser with proper namespace handling.

Public Types

typedef _Handler handler_type

Public Functions

sax_ns_parser(const char *content, const size_t size, xmlns_context &ns_cxt, handler_type &handler)

sax_ns_parser(const char *content, const size_t size, bool transient_stream, xmlns_context &ns_cxt, handler_type &handler)

~sax_ns_parser()

void parse()

template<typename _Handler> class orcus::sax_token_parser XML parser that tokenizes element and attribute names while parsing.

Public Types

typedef _Handler handler_type

Public Functions

sax_token_parser(const char *content, const size_t size, const tokens &_tokens, xmlns_context &ns_cxt, handler_type &handler)

sax_token_parser(const char *content, const size_t size, bool transient_stream, const tokens &_tokens, xmlns_context &ns_cxt, handler_type &handler)

~sax_token_parser()

void parse()

Parser Handlers

class orcus::sax_handler

Public Functions

inline void doctype(const orcus::sax::doctype_declaration &param) Called when a doctype declaration is encountered. Parameters param – struct containing doctype declaration data.

inline void start_declaration(const orcus::pstring &decl) Called when a declaration of the form <?... ?> is encountered. Parameters decl – name of the identifier.

inline void start_element(const orcus::sax::parser_element &elem) Called at the start of each element. Parameters elem – information of the element being parsed.

inline void end_element(const orcus::sax::parser_element &elem) Called at the end of each element. Parameters elem – information of the element being parsed.

inline void characters(const orcus::pstring &val, bool transient) Called when a segment of text content is parsed. Each text content is a direct child of an element, which may have multiple child contents when the element also has child elements that are direct siblings of the text contents, or when the text contents are split by a comment. Parameters • val – value of the text content. • transient – when true, the text content has been converted and is stored in a temporary buffer due to the presence of one or more encoded characters, in which case the passed text value needs to be either immediately converted to a non-text value or be interned within the scope of the callback.

inline void attribute(const orcus::sax::parser_attribute &attr) Called upon parsing of an attribute of an element. Note that when the attribute’s transient flag is set, the attribute value is stored in a temporary buffer due to the presence of one or more encoded characters, and must be processed within the scope of the callback. Parameters attr – struct containing attribute information.

class orcus::sax_ns_handler

Public Functions

inline void doctype(const orcus::sax::doctype_declaration&)

inline void start_declaration(const orcus::pstring&)

inline void end_declaration(const orcus::pstring&)

inline void start_element(const orcus::sax_ns_parser_element&)

inline void end_element(const orcus::sax_ns_parser_element&)

inline void characters(const orcus::pstring&, bool)

inline void attribute(const orcus::pstring&, const orcus::pstring&)

inline void attribute(const orcus::sax_ns_parser_attribute&)

class orcus::sax_token_handler

Public Functions

inline void declaration(const orcus::xml_declaration_t &decl) Called immediately after the entire XML declaration has been parsed. Parameters decl – struct containing the attributes of the XML declaration.

inline void start_element(const orcus::xml_token_element_t &elem) Called at the start of each element. Parameters elem – struct containing the element’s information as well as all the attributes that belong to the element.

inline void end_element(const orcus::xml_token_element_t &elem) Called at the end of each element. Parameters elem – struct containing the element’s information as well as all the attributes that belong to the element.

inline void characters(const orcus::pstring &val, bool transient) Called when a segment of text content is parsed. Each text content is a direct child of an element, which may have multiple child contents when the element also has child elements that are direct siblings of the text contents, or when the text contents are split by a comment. Parameters • val – value of the text content. • transient – when true, the text content has been converted and is stored in a temporary buffer due to the presence of one or more encoded characters, in which case the passed text value needs to be either immediately converted to a non-text value or be interned within the scope of the callback.

Namespace

class orcus::xmlns_repository Central XML namespace repository that stores all namespaces that are used in the current session.

Public Functions

xmlns_repository()

~xmlns_repository()

void add_predefined_values(const xmlns_id_t *predefined_ns) Add a set of predefined namespace values to the repository. Parameters predefined_ns – predefined set of namespace values. This is a null-terminated array of xmlns_id_t. This xmlns_repository instance will assume that the instances of these xmlns_id_t values will be available throughout its life cycle; the caller needs to ensure that they won’t get deleted before the corresponding xmlns_repository instance is deleted.

xmlns_context create_context()

xmlns_id_t get_identifier(size_t index) const Get XML namespace identifier from its numerical index. Parameters index – numeric index of namespace. Returns valid namespace identifier, or XMLNS_UNKNOWN_ID if not found.

std::string get_short_name(xmlns_id_t ns_id) const

std::string get_short_name(size_t index) const

class orcus::xmlns_context XML namespace context. A new context should be used for each xml stream since the namespace keys themselves are not interned. Don’t hold an instance of this class any longer than the life cycle of the xml stream it is used in. An empty key value is associated with a default namespace.

Public Functions

xmlns_context()

xmlns_context(xmlns_context&&)

xmlns_context(const xmlns_context &r)

~xmlns_context()

xmlns_context &operator=(const xmlns_context &r)

xmlns_context &operator=(xmlns_context &&r)

xmlns_id_t push(const pstring &key, const pstring &uri)

void pop(const pstring &key)

xmlns_id_t get(const pstring &key) const Get the current namespace identifier for a specified namespace alias. Parameters key – namespace alias to get the current namespace identifier for. Returns current namespace identifier associated with the alias.

size_t get_index(xmlns_id_t ns_id) const Get a unique index value associated with a specified identifier. An index value is guaranteed to be unique regardless of contexts. Parameters ns_id – a namespace identifier to obtain index for. Returns index value associated with the identifier.

std::string get_short_name(xmlns_id_t ns_id) const Get a ‘short’ name associated with a specified identifier. A short name is a string value conveniently short enough for display purposes, but still guaranteed to be unique to the identifier it is associated with. Note that the xmlns_repository class has a method of the same name, and that method works identically to this method. Parameters ns_id – a namespace identifier to obtain short name for. Returns short name for the specified identifier.

pstring get_alias(xmlns_id_t ns_id) const Get an alias currently associated with a given namespace identifier. Parameters ns_id – namespace identifier. Returns alias name currently associated with the given namespace identifier, or an empty string if the given namespace is currently not associated with any aliases.

std::vector get_all_namespaces() const

void dump(std::ostream &os) const

void swap(xmlns_context &other) noexcept
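The push(), pop() and get() methods suggest that each alias tracks a "current" namespace that can be temporarily overridden by nested elements. One plausible reading of that behavior is a stack of URIs per alias, sketched below. The class toy_ns_context is a hypothetical stand-in and not the orcus implementation: it uses plain URI strings in place of xmlns_id_t, and returning an empty string for an unbound alias is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in; identifiers are plain URI strings here.
class toy_ns_context
{
    // One stack of URIs per alias; the empty alias is the default namespace.
    std::unordered_map<std::string, std::vector<std::string>> m_stacks;

public:
    void push(const std::string& alias, const std::string& uri)
    {
        m_stacks[alias].push_back(uri);
    }

    void pop(const std::string& alias)
    {
        auto it = m_stacks.find(alias);
        if (it == m_stacks.end() || it->second.empty())
            return;
        it->second.pop_back();
    }

    // Returns the URI currently bound to the alias, or an empty string
    // if the alias has no binding (an assumption of this sketch).
    std::string get(const std::string& alias) const
    {
        auto it = m_stacks.find(alias);
        if (it == m_stacks.end() || it->second.empty())
            return std::string();
        return it->second.back();
    }
};
```

With this model, a nested element that redeclares the default namespace pushes a new binding on entry and pops it on exit, which restores the outer binding.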

2.1.8 YAML Parser

template<typename _Handler> class orcus::yaml_parser : public orcus::yaml::parser_base

Public Types

typedef _Handler handler_type

Public Functions

yaml_parser(const char *p, size_t n, handler_type &hdl)

void parse()

Parser Handler

class orcus::yaml_handler

Public Functions

inline void begin_parse() Called when the parser starts parsing a content. inline void end_parse() Called when the parser finishes parsing an entire content. inline void begin_document() Called when a new document is encountered. inline void end_document() Called when the parser has finished parsing a document. inline void begin_sequence() Called when a sequence begins. inline void end_sequence() Called when a sequence ends. inline void begin_map() Called when a map begins. inline void begin_map_key() Called when the parser starts parsing a map key. inline void end_map_key() Called when the parser finishes parsing a map key. inline void end_map() Called when the parser finishes parsing an entire map. inline void string(const char *p, size_t n) Called when a string value is encountered. Parameters • p – pointer to the first character of the string value. • n – length of the string value. inline void number(double val) Called when a numeric value is encountered. Parameters val – numeric value. inline void boolean_true() Called when a boolean ‘true’ keyword is encountered. inline void boolean_false() Called when a boolean ‘false’ keyword is encountered. inline void null() Called when a ‘null’ keyword is encountered.
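Because map keys are bracketed by begin_map_key() and end_map_key(), a handler can tell keys apart from values even though both arrive through the same string() callback. The sketch below collects only the string keys. The name map_key_collector is hypothetical, and the handler is driven by hand here; in real use it would be passed to orcus::yaml_parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical handler sketch using the documented callback signatures.
struct map_key_collector
{
    std::vector<std::string> keys;
    bool in_key = false; // true between begin_map_key() and end_map_key()

    void begin_parse() {}
    void end_parse() {}
    void begin_document() {}
    void end_document() {}
    void begin_sequence() {}
    void end_sequence() {}
    void begin_map() {}
    void end_map() {}
    void begin_map_key() { in_key = true; }
    void end_map_key() { in_key = false; }

    void string(const char* p, std::size_t n)
    {
        // Values also arrive here; keep only strings seen inside a key.
        if (in_key)
            keys.emplace_back(p, n);
    }

    void number(double) {}
    void boolean_true() {}
    void boolean_false() {}
    void null() {}
};
```

For a document such as `name: orcus` followed by `version: 0.16`, the keys "name" and "version" would be collected while the values are ignored.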

2.1.9 XML Writer

class orcus::xml_writer This class lets you produce XML contents from scratch. It writes its content to any object supporting the std::ostream interface.

Public Functions

xml_writer(const xml_writer&) = delete

xml_writer &operator=(const xml_writer&) = delete

xml_writer(xmlns_repository &ns_repo, std::ostream &os)

xml_writer(xml_writer &&other)

xml_writer &operator=(xml_writer &&other)

~xml_writer() Destructor. Any remaining element(s) on the stack will get popped when the destructor is called.

scope push_element_scope(const xml_name_t &name) Push a new element to the stack, and write an opening element to the output stream. It differs from the push_element method in that the new element will be automatically popped when the returned object goes out of scope. Parameters name – name of the new element. Returns scope object which automatically pops the element when it goes out of scope.

void push_element(const xml_name_t &name) Push a new element to the stack, and write an opening element to the output stream. Parameters name – name of the element.

xmlns_id_t add_namespace(const pstring &alias, const pstring &value) Add a namespace definition for the next element to be pushed. Parameters • alias – alias for the namespace. • value – value of the namespace definition. Returns ID for the namespace being added.

void add_attribute(const xml_name_t &name, const pstring &value) Add a new attribute for the next element to be pushed. Parameters • name – name of the attribute to be added. • value – value of the attribute to be added.

void add_content(const pstring &content) Add a content to the current element on the stack. The content will be properly encoded.

Parameters content – content to be added to the current element.

xml_name_t pop_element() Pop the current element from the stack, and write a closing element to the output stream. Returns the name of the element being popped.

class scope

Public Functions

scope(const scope&) = delete

scope(scope &&other)

~scope()

scope &operator=(scope &&other)
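The difference between push_element() and push_element_scope() is that the latter ties the closing tag to the lifetime of a returned object. The sketch below illustrates that RAII pattern with a toy writer. The class toy_writer and the function write_sample() are hypothetical; unlike orcus::xml_writer, the toy takes plain strings as element names, needs no namespace repository, and supports no attributes or content encoding.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in illustrating the scope-based element handling.
class toy_writer
{
    std::ostream& m_os;
    std::vector<std::string> m_stack; // names of currently open elements

public:
    // Move-only guard that pops one element when it is destroyed.
    class scope
    {
        toy_writer* m_parent;
    public:
        explicit scope(toy_writer& w) : m_parent(&w) {}
        scope(const scope&) = delete;
        scope& operator=(const scope&) = delete;
        scope(scope&& other) : m_parent(other.m_parent) { other.m_parent = nullptr; }
        ~scope() { if (m_parent) m_parent->pop_element(); }
    };

    explicit toy_writer(std::ostream& os) : m_os(os) {}

    // Any elements still open are closed on destruction.
    ~toy_writer() { while (!m_stack.empty()) pop_element(); }

    void push_element(const std::string& name)
    {
        m_os << '<' << name << '>';
        m_stack.push_back(name);
    }

    scope push_element_scope(const std::string& name)
    {
        push_element(name);
        return scope(*this);
    }

    std::string pop_element()
    {
        if (m_stack.empty())
            return std::string();
        std::string name = m_stack.back();
        m_stack.pop_back();
        m_os << "</" << name << '>';
        return name;
    }
};

// Usage: closing tags are written as the scope objects are destroyed.
inline std::string write_sample()
{
    std::ostringstream os;
    {
        toy_writer w(os);
        auto root = w.push_element_scope("root");
        {
            auto child = w.push_element_scope("child");
        } // "child" is popped here
    } // "root" is popped when its scope object is destroyed
    return os.str();
}
```

Scope objects are destroyed in reverse order of creation, which matches the order in which nested elements must be closed.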

2.2 Types and Interfaces

2.2.1 Global Interface

class orcus::iface::import_filter Subclassed by orcus::orcus_csv, orcus::orcus_gnumeric, orcus::orcus_ods, orcus::orcus_xls_xml, orcus::orcus_xlsx

Public Functions

import_filter(format_t input)

virtual ~import_filter()

virtual void read_file(const std::string &filepath) = 0 Expects a system path to a local file.

virtual void read_stream(const char *content, size_t len) = 0 Expects the whole content of the file.

virtual const char *get_name() const = 0

void set_config(const orcus::config &v)

const orcus::config &get_config() const

class orcus::iface::document_dumper Subclassed by orcus::spreadsheet::document

Public Functions

virtual ~document_dumper()

virtual void dump(dump_format_t format, const std::string &output) const = 0

virtual void dump_check(std::ostream &os) const = 0

2.2.2 Spreadsheet Interface

import_array_formula

class orcus::spreadsheet::iface::import_array_formula

Public Functions

virtual ~import_array_formula()

virtual void set_range(const range_t &range) = 0

virtual void set_formula(formula_grammar_t grammar, const char *p, size_t n) = 0

virtual void set_result_string(row_t row, col_t col, const char *p, size_t n) = 0

virtual void set_result_value(row_t row, col_t col, double value) = 0

virtual void set_result_bool(row_t row, col_t col, bool value) = 0

virtual void set_result_empty(row_t row, col_t col) = 0

virtual void commit() = 0

import_auto_filter

class orcus::spreadsheet::iface::import_auto_filter

Public Functions

virtual ~import_auto_filter() = 0

virtual void set_range(const range_t &range) = 0 Specify the range where the auto filter is applied. Parameters range – structure containing the top-left and bottom-right positions of the auto filter range.

virtual void set_column(col_t col) = 0 Specify the column position of a filter. The position is relative to the first column in the auto filter range. Parameters col – 0-based column position of a filter relative to the first column.

virtual void append_column_match_value(const char *p, size_t n) = 0 Add a match value to the current column filter. Parameters • p – pointer to the first character of match value. • n – length of match value.

virtual void commit_column() = 0 Commit current column filter to the current auto filter.

virtual void commit() = 0 Commit current auto filter to the model.

import_conditional_format

class orcus::spreadsheet::iface::import_conditional_format This is an optional interface to import conditional formatting.

A conditional format consists of:
• a range
• several entries

Each entry consists of:
• a type
• a few properties depending on the type (optional)
• zero or more conditions depending on the type

Each condition consists of:
• a formula/value/string
• a color (optional)

Public Functions

virtual ~import_conditional_format() = 0

virtual void set_color(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0 Sets the color of the current condition. Only valid for type == databar or type == colorscale.

virtual void set_formula(const char *p, size_t n) = 0 Sets the formula, value or string of the current condition.

virtual void set_condition_type(condition_type_t type) = 0 Sets the type for the formula, value or string of the current condition. Only valid for type == iconset, databar or colorscale.

virtual void set_date(condition_date_t date) = 0 Only valid for type == date.

virtual void commit_condition() = 0 Commits the current condition to the current entry.

virtual void set_icon_name(const char *p, size_t n) = 0 Name of the icons to use in the current entry. Only valid for type == iconset.

virtual void set_databar_gradient(bool gradient) = 0 Use a gradient for the current entry. Only valid for type == databar.

virtual void set_databar_axis(databar_axis_t axis) = 0 Position of the 0 axis in the current entry. Only valid for type == databar.

virtual void set_databar_color_positive(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0 Databar color for positive values. Only valid for type == databar.

virtual void set_databar_color_negative(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0 Databar color for negative values. Only valid for type == databar.

virtual void set_min_databar_length(double length) = 0 Sets the minimum length for a databar. Only valid for type == databar.

virtual void set_max_databar_length(double length) = 0 Sets the maximum length for a databar. Only valid for type == databar.

virtual void set_show_value(bool show) = 0 Don’t show the value in the cell. Only valid for type == databar, iconset or colorscale.

virtual void set_iconset_reverse(bool reverse) = 0 Use the icons in reverse order. Only valid for type == iconset.

virtual void set_xf_id(size_t xf) = 0 TODO: In OOXML the style is stored as dxf and in ODF as named style.

virtual void set_operator(condition_operator_t condition_type) = 0 Sets the current operation used for the current entry. Only valid for type == condition.

virtual void set_type(conditional_format_t type) = 0

virtual void commit_entry() = 0

virtual void set_range(const char *p, size_t n) = 0

virtual void set_range(row_t row_start, col_t col_start, row_t row_end, col_t col_end) = 0

virtual void commit_format() = 0

import_data_table

class orcus::spreadsheet::iface::import_data_table Interface for importing data tables.

Public Functions

virtual ~import_data_table() = 0

virtual void set_type(data_table_type_t type) = 0

virtual void set_range(const range_t &range) = 0

virtual void set_first_reference(const char *p_ref, size_t n_ref, bool deleted) = 0

virtual void set_second_reference(const char *p_ref, size_t n_ref, bool deleted) = 0

virtual void commit() = 0

import_factory

class orcus::spreadsheet::iface::import_factory This interface provides the filters a means to instantiate concrete classes that implement the above interfaces. The client code never has to manually delete objects returned by its methods; the implementor of this interface must manage the life cycles of objects it returns. The implementor of this interface normally wraps the document instance inside it and has the document instance manage the life cycles of the various objects it creates. Subclassed by orcus::spreadsheet::import_factory

Public Functions

virtual ~import_factory() = 0

virtual import_global_settings *get_global_settings()

virtual import_shared_strings *get_shared_strings()

Returns pointer to the shared strings instance. It may return NULL if the client app doesn’t support shared strings. virtual import_named_expression *get_named_expression()

virtual import_styles *get_styles()

Returns pointer to the styles instance. It may return NULL if the client app doesn’t support styles. virtual import_reference_resolver *get_reference_resolver(formula_ref_context_t cxt)

virtual import_pivot_cache_definition *create_pivot_cache_definition(pivot_cache_id_t cache_id) Create an interface for pivot cache definition import for a specified cache ID. In case a pivot cachealrady exists for the passed ID, the client app should overwrite the existing cache with a brand-new cache instance. Parameters cache_id – numeric ID associated with the pivot cache. Returns pointer to the pivot cache interface instance. If may return NULL if the client app doesn’t support pivot tables. virtual import_pivot_cache_records *create_pivot_cache_records(pivot_cache_id_t cache_id) Create an interface for pivot cache records import for a specified cache ID. Parameters cache_id – numeric ID associated with the pivot cache. Returns pointer to the pivot cache records interface instance. If may return nullptr if the client app doesn’t support pivot tables. virtual import_sheet *append_sheet(sheet_t sheet_index, const char *sheet_name, size_t sheet_name_length) = 0 Append a sheet with specified sheet position index and name. Parameters • sheet_index – position index of the sheet to be appended. It is 0-based i.e. the first sheet to be appended will have an index value of 0. • sheet_name – pointer to the first character in the buffer where the sheet name is stored. • sheet_name_length – length of the sheet name. Returns pointer to the sheet instance. It may return nullptr if the client app fails to append a new sheet. virtual import_sheet *get_sheet(const char *sheet_name, size_t sheet_name_length) = 0

Returns pointer to the sheet instance whose name matches the name passed to this method. It returns nullptr if no sheet instance exists by the specified name.
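
As a rough illustration of the append_sheet() and name-based get_sheet() contract described above, the following self-contained sketch keeps sheets in a vector and returns nullptr when no sheet matches. The sheet_stub and sheet_store types are hypothetical stand-ins that do not derive from the orcus interfaces, and rejecting out-of-order indices is an assumption of this sketch rather than documented orcus behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sheet object (not orcus's import_sheet).
struct sheet_stub
{
    std::string name;
};

// Hypothetical store that mimics append_sheet() and get_sheet() by name.
class sheet_store
{
    std::vector<std::unique_ptr<sheet_stub>> m_sheets;

public:
    // The name arrives as pointer + length and need not be null-terminated.
    sheet_stub* append_sheet(std::size_t sheet_index, const char* name, std::size_t name_length)
    {
        if (sheet_index != m_sheets.size())
            return nullptr; // assumption: sheets must be appended in index order

        auto sh = std::make_unique<sheet_stub>();
        sh->name.assign(name, name_length);
        m_sheets.push_back(std::move(sh));
        return m_sheets.back().get();
    }

    // Returns nullptr when no sheet has the given name.
    sheet_stub* get_sheet(const char* name, std::size_t name_length) const
    {
        const std::string key(name, name_length);
        for (const auto& sh : m_sheets)
        {
            if (sh->name == key)
                return sh.get();
        }
        return nullptr;
    }
};

// Usage: append one sheet, then look it up by name.
inline void sheet_store_example()
{
    sheet_store store;
    sheet_stub* appended = store.append_sheet(0, "Data", 4);
    assert(appended != nullptr);
    assert(store.get_sheet("Data", 4) == appended);
    assert(store.get_sheet("Missing", 7) == nullptr);
}
```

Callers are expected to check the returned pointer before use, since both methods may legitimately return nullptr.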

virtual import_sheet *get_sheet(sheet_t sheet_index) = 0
Retrieve sheet instance by specified numerical sheet index.
Parameters sheet_index – sheet index
Returns pointer to the sheet instance, or nullptr if no sheet instance exists at specified sheet index position.

virtual void finalize() = 0
This method is called at the end of import, to give the implementor a chance to perform post-processing if necessary.

import_formula

class orcus::spreadsheet::iface::import_formula

Public Functions

virtual ~import_formula()

virtual void set_position(row_t row, col_t col) = 0
Set the position of the cell.
Parameters
• row – row position.
• col – column position.

virtual void set_formula(formula_grammar_t grammar, const char *p, size_t n) = 0
Set formula string to the specified cell.
Parameters
• grammar – grammar to use to compile the formula string into tokens.
• p – pointer to the buffer where the formula string is stored.
• n – size of the buffer where the formula string is stored.

virtual void set_shared_formula_index(size_t index) = 0
Register the formula as a shared formula, to be shared with other cells.
Parameters index – shared formula index to register the formula with.

virtual void set_result_string(const char *p, size_t n) = 0
Set cached result of string type.
Parameters
• p – pointer to the buffer where the string result is stored.
• n – size of the buffer where the string result is stored.

virtual void set_result_value(double value) = 0
Set cached result of numeric type.
Parameters value – numeric value to set as a cached result.

virtual void set_result_bool(bool value) = 0
Set cached result of boolean type.
Parameters value – boolean value to set as a cached result.

virtual void set_result_empty() = 0
Set empty value as a cached result.

virtual void commit() = 0
Commit all the formula data to the specified cell.

import_global_settings

class orcus::spreadsheet::iface::import_global_settings

Public Functions

virtual ~import_global_settings() = 0

virtual void set_origin_date(int year, int month, int day) = 0
Set the date that is to be represented by a value of 0. All date values will be internally represented relative to this date afterward.
Parameters
• year – 1-based value representing year
• month – 1-based value representing month, varying from 1 through 12.
• day – 1-based value representing day, varying from 1 through 31.

virtual void set_default_formula_grammar(formula_grammar_t grammar) = 0
Set formula grammar to be used globally when parsing formulas if the grammar is not specified. This grammar will also be used when parsing range strings associated with shared formula ranges, array formula ranges, autofilter ranges etc.
Parameters grammar – default formula grammar

virtual formula_grammar_t get_default_formula_grammar() const = 0
Get current default formula grammar.
Returns current default formula grammar.

virtual void set_character_set(character_set_t charset) = 0
Set the character set to be used when parsing string values.
Parameters charset – character set to apply when parsing string values.

import_named_expression

class orcus::spreadsheet::iface::import_named_expression
Interface for importing named expressions or ranges. Note that this interface has two different methods for defining named expressions: set_named_expression() and set_named_range(). The set_named_expression() method is generally used to pass named expression strings. The set_named_range() method is used only when the format uses a different syntax to express a named range. A named range is a special case of named expression where the expression consists of one range token.

Public Functions

virtual ~import_named_expression()

virtual void set_base_position(const src_address_t &pos) = 0
Specify an optional base position from which to evaluate a named expression. If not specified, the implementor should use the top-left cell position on the first sheet as its implied base position.
Parameters pos – cell position to be used as the base.

virtual void set_named_expression(const char *p_name, size_t n_name, const char *p_exp, size_t n_exp) = 0
Define a new named expression or overwrite an existing one.
Parameters
• p_name – pointer to the buffer that stores the name of the expression to be defined.
• n_name – size of the buffer that stores the name of the expression to be defined.
• p_exp – pointer to the buffer that stores the expression to be associated with the name.
• n_exp – size of the buffer that stores the expression to be associated with the name.

virtual void set_named_range(const char *p_name, size_t n_name, const char *p_range, size_t n_range) = 0
Define a new named range or overwrite an existing one. Note that you can only define one named range or expression per single commit.
Parameters
• p_name – pointer to the buffer that stores the name of the expression to be defined.
• n_name – size of the buffer that stores the name of the expression to be defined.
• p_range – pointer to the buffer that stores the range to be associated with the name.
• n_range – size of the buffer that stores the range to be associated with the name.

virtual void commit() = 0
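
To make the one-definition-per-commit rule concrete, here is a minimal sketch of an implementor that buffers a single pending definition and stores it on commit(), overwriting any existing entry of the same name. The named_expression_store class and its members are hypothetical; it mirrors only the call sequence of the interface above and does not parse or validate the expression strings.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical store; mimics the call sequence only, not the orcus class.
class named_expression_store
{
    std::string m_name;       // pending name
    std::string m_expression; // pending expression or range string
    bool m_is_range = false;  // true if the pending entry came from set_named_range()

public:
    struct entry
    {
        std::string expression;
        bool is_range;
    };

    std::map<std::string, entry> committed;

    void set_named_expression(const char* p_name, std::size_t n_name, const char* p_exp, std::size_t n_exp)
    {
        m_name.assign(p_name, n_name);
        m_expression.assign(p_exp, n_exp);
        m_is_range = false;
    }

    void set_named_range(const char* p_name, std::size_t n_name, const char* p_range, std::size_t n_range)
    {
        m_name.assign(p_name, n_name);
        m_expression.assign(p_range, n_range);
        m_is_range = true;
    }

    // Store the single pending definition, replacing any entry with the same name.
    void commit()
    {
        committed[m_name] = entry{m_expression, m_is_range};
        m_name.clear();
        m_expression.clear();
        m_is_range = false;
    }
};

// Usage: one definition per commit.
inline void named_expression_store_example()
{
    named_expression_store store;
    store.set_named_expression("total", 5, "SUM(A1:A10)", 11);
    store.commit();
    store.set_named_range("data", 4, "Sheet1.A1:B5", 12);
    store.commit();
    assert(store.committed.size() == 2);
    assert(store.committed.at("data").is_range);
}
```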

import_pivot_cache_definition

class orcus::spreadsheet::iface::import_pivot_cache_definition
Interface for importing pivot cache definition.

Public Functions

virtual ~import_pivot_cache_definition()

virtual void set_worksheet_source(const char *ref, size_t n_ref, const char *sheet_name, size_t n_sheet_name) = 0
Specify that the source data of this pivot cache is located on a local worksheet.
Parameters
• ref – pointer to the char array that contains the range string specifying the source range.
• n_ref – size of the aforementioned char array.
• sheet_name – pointer to the char array that contains the name of the worksheet where the source data is located.
• n_sheet_name – size of the aforementioned char array.

virtual void set_worksheet_source(const char *table_name, size_t n_table_name) = 0
Specify that the source data of this pivot cache is associated with a table.
Parameters
• table_name – pointer to the char array that contains the name of the table.
• n_table_name – size of the aforementioned char array.

virtual void set_field_count(size_t n) = 0
Set the total number of fields present in this pivot cache.
Parameters n – total number of fields in this pivot cache.

virtual void set_field_name(const char *p, size_t n) = 0
Set the name of the field in the current field buffer.
Parameters
• p – pointer to the char array that contains the field name.
• n – size of the aforementioned char array.

virtual void set_field_min_value(double v) = 0
Set the lowest value of the field in the current field buffer.
Parameters v – lowest value of the field.

virtual void set_field_max_value(double v) = 0
Set the highest value of the field in the current field buffer.
Parameters v – highest value of the field.

virtual void set_field_min_date(const date_time_t &dt) = 0
Set the lowest date value of the field in the current field buffer.
Parameters dt – lowest date value of the field.

virtual void set_field_max_date(const date_time_t &dt) = 0
Set the highest date value of the field in the current field buffer.
Parameters dt – highest date value of the field.

virtual import_pivot_cache_field_group *create_field_group(size_t base_index) = 0
Mark the current field as a group field. This method gets called first to signify that the current field is a group field.
Parameters base_index – 0-based index of the field this field is the parent group of.
Returns interface for importing group field data.

virtual void commit_field() = 0
Commit the field in the current field buffer to the pivot cache model.
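
The "current field buffer" wording above suggests a buffer-then-commit pattern, which could be sketched as follows. The field_stub and pivot_cache_sketch names are hypothetical, only a few of the setters are mirrored, and resetting the buffer after commit_field() is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical field record holding a subset of the attributes set above.
struct field_stub
{
    std::string name;
    double min_value = 0.0;
    double max_value = 0.0;
};

// Hypothetical implementor: setters write into a buffer, commit_field() stores it.
class pivot_cache_sketch
{
    field_stub m_current; // the "current field buffer"

public:
    std::vector<field_stub> fields;

    void set_field_count(std::size_t n) { fields.reserve(n); }
    void set_field_name(const char* p, std::size_t n) { m_current.name.assign(p, n); }
    void set_field_min_value(double v) { m_current.min_value = v; }
    void set_field_max_value(double v) { m_current.max_value = v; }

    void commit_field()
    {
        fields.push_back(m_current);
        m_current = field_stub(); // assumption: start the next field from a clean buffer
    }
};

// Usage: two fields, each finished with commit_field().
inline void pivot_cache_sketch_example()
{
    pivot_cache_sketch cache;
    cache.set_field_count(2);
    cache.set_field_name("Price", 5);
    cache.set_field_min_value(1.5);
    cache.set_field_max_value(9.0);
    cache.commit_field();
    cache.set_field_name("Qty", 3);
    cache.commit_field();
    assert(cache.fields.size() == 2);
    assert(cache.fields[0].name == "Price");
    assert(cache.fields[1].min_value == 0.0);
}
```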
virtual void set_field_item_string(const char *p, size_t n) = 0
Set a string value to the current field item buffer.
Parameters
• p – pointer to the char array that contains the string value.
• n – size of the aforementioned char array.

virtual void set_field_item_numeric(double v) = 0
Set a numeric value to the current field item buffer.
Parameters v – numeric value.

virtual void set_field_item_date_time(const date_time_t &dt) = 0
Set a date-time value to the current field item buffer.
Parameters dt – date-time value.

virtual void set_field_item_error(error_value_t ev) = 0
Set an error value to the current field item buffer.
Parameters ev – error value.

virtual void commit_field_item() = 0
Commit the field item in the current field item buffer to the current field model.

virtual void commit() = 0
Commit the current pivot cache model to the document model.

import_pivot_cache_records

class orcus::spreadsheet::iface::import_pivot_cache_records
Interface for importing pivot cache records.

Public Functions

virtual ~import_pivot_cache_records()

virtual void set_record_count(size_t n) = 0

virtual void append_record_value_numeric(double v) = 0

virtual void append_record_value_character(const char *p, size_t n) = 0

virtual void append_record_value_shared_item(size_t index) = 0

virtual void commit_record() = 0
Commit the record in the current buffer, and clear the buffer.

virtual void commit() = 0
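
Most of the record methods above are undocumented, so the following is only a guess at the intended call pattern: values are appended to a record buffer, and commit_record() stores the record and clears the buffer. The record_value and pivot_records_sketch types are hypothetical and not part of orcus.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tagged value covering the three append_record_value_*() variants.
struct record_value
{
    enum class kind { numeric, character, shared_item };
    kind type = kind::numeric;
    double numeric = 0.0;
    std::string text;
    std::size_t shared_index = 0;
};

// Hypothetical implementor that buffers one record at a time.
class pivot_records_sketch
{
    std::vector<record_value> m_current; // current record buffer

public:
    std::vector<std::vector<record_value>> records;

    void set_record_count(std::size_t n) { records.reserve(n); }

    void append_record_value_numeric(double v)
    {
        record_value rv;
        rv.type = record_value::kind::numeric;
        rv.numeric = v;
        m_current.push_back(rv);
    }

    void append_record_value_character(const char* p, std::size_t n)
    {
        record_value rv;
        rv.type = record_value::kind::character;
        rv.text.assign(p, n);
        m_current.push_back(rv);
    }

    void append_record_value_shared_item(std::size_t index)
    {
        record_value rv;
        rv.type = record_value::kind::shared_item;
        rv.shared_index = index;
        m_current.push_back(rv);
    }

    // Store the buffered record, then clear the buffer for the next one.
    void commit_record()
    {
        records.push_back(m_current);
        m_current.clear();
    }
};

// Usage: two records with different numbers of values.
inline void pivot_records_sketch_example()
{
    pivot_records_sketch recs;
    recs.set_record_count(2);
    recs.append_record_value_numeric(3.5);
    recs.append_record_value_character("abc", 3);
    recs.commit_record();
    recs.append_record_value_shared_item(1);
    recs.commit_record();
    assert(recs.records.size() == 2);
    assert(recs.records[0].size() == 2);
    assert(recs.records[1][0].shared_index == 1);
}
```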

import_reference_resolver

class orcus::spreadsheet::iface::import_reference_resolver

Public Functions

virtual ~import_reference_resolver()

virtual src_address_t resolve_address(const char *p, size_t n) = 0
Resolve a textual representation of a single cell address.
Parameters
• p – pointer to the first character of the single cell address string.
• n – size of the single cell address string.
Throws orcus::invalid_arg_error – the string is not a valid single cell address.
Returns structure containing the column and row positions of the address.

virtual src_range_t resolve_range(const char *p, size_t n) = 0
Resolve a textual representation of a range address. Note that a string representing a valid single cell address should be considered a valid range address.
Parameters
• p – pointer to the first character of the range address string.
• n – size of the range address string.
Throws invalid_arg_error – the string is not a valid range address.
Returns structure containing the start and end positions of the range address.

import_shared_strings

class orcus::spreadsheet::iface::import_shared_strings
Interface class designed to be derived by the implementor.
Subclassed by orcus::spreadsheet::import_shared_strings

Public Functions

virtual ~import_shared_strings() = 0

virtual size_t append(const char *s, size_t n) = 0
Append new string to the string list. Order of insertion is important since that determines the numerical ID values of inserted strings. Note that this method assumes that the caller knows the string being appended is not yet in the pool.
Parameters
• s – pointer to the first character of the string array. The string array doesn’t necessarily have to be null-terminated.
• n – length of the string.
Returns ID of the string just inserted.

virtual size_t add(const char *s, size_t n) = 0
Similar to the append method, it adds new string to the string pool; however, this method checks if the string being added is already in the pool before each insertion, to avoid duplicated strings.
Parameters
• s – pointer to the first character of the string array. The string array doesn’t necessarily have to be null-terminated.
• n – length of the string.
Returns ID of the string just inserted.

virtual void set_segment_font(size_t font_index) = 0
Set the index of a font to apply to the current format attributes.
Parameters font_index – positive integer representing the font to use.

virtual void set_segment_bold(bool b) = 0
Set whether or not to make the font bold in the current format attributes.
Parameters b – true if it’s bold, false otherwise.

virtual void set_segment_italic(bool b) = 0
Set whether or not to make the font italic in the current format attributes.
Parameters b – true if it’s italic, false otherwise.

virtual void set_segment_font_name(const char *s, size_t n) = 0
Set the name of a font to the current format attributes.
Parameters
• s – pointer to the first character of a char array that stores the font name.
• n – size of the char array that stores the font name.

virtual void set_segment_font_size(double point) = 0
Set a font size to the current format attributes.
Parameters point – font size in points.

virtual void set_segment_font_color(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0
Set the color of a font in ARGB to the current format attributes.
Parameters
• alpha – alpha component value (0-255).
• red – red component value (0-255).
• green – green component value (0-255).
• blue – blue component value (0-255).

virtual void append_segment(const char *s, size_t n) = 0
Append a string segment with the current format attributes to the formatted string buffer.
Parameters
• s – pointer to the first character of the string array. The string array doesn’t necessarily have to be null-terminated.
• n – length of the string.
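
The difference between append() and add() described above can be sketched with a small string pool. The string_pool class is a hypothetical illustration, not the orcus implementation: append() always inserts and returns the next sequential ID, while add() first looks the string up and reuses the existing ID when it finds one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical string pool illustrating append() versus add().
class string_pool
{
    std::vector<std::string> m_strings;                   // ID -> string
    std::unordered_map<std::string, std::size_t> m_index; // string -> first ID

public:
    // Always inserts; the ID is the insertion position.
    std::size_t append(const char* s, std::size_t n)
    {
        m_strings.emplace_back(s, n);
        const std::size_t id = m_strings.size() - 1;
        m_index.emplace(m_strings.back(), id); // keeps the earlier ID if already indexed
        return id;
    }

    // Inserts only if the string is not yet in the pool.
    std::size_t add(const char* s, std::size_t n)
    {
        auto it = m_index.find(std::string(s, n));
        if (it != m_index.end())
            return it->second;
        return append(s, n);
    }

    std::size_t size() const { return m_strings.size(); }
};

// Usage: add() deduplicates, append() does not.
inline void string_pool_example()
{
    string_pool pool;
    const std::size_t first = pool.append("foo", 3);
    const std::size_t again = pool.add("foo", 3);
    const std::size_t dup = pool.append("foo", 3);
    assert(first == 0);
    assert(again == 0);
    assert(dup == 1);
    assert(pool.size() == 2);
}
```

Because append() skips the lookup, it suits callers that already know the string is new, as the description of append() notes.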

virtual size_t commit_segments() = 0
Store the formatted string in the current buffer to the shared strings store. The implementation may choose to unconditionally append the string to the store, or choose to look for an existing identical formatted string to reuse and discard the new one if one exists.
Returns ID of the string just inserted, or the ID of an existing string with identical formatting attributes.

import_sheet

class orcus::spreadsheet::iface::import_sheet
Interface for sheet.

Public Functions

virtual ~import_sheet() = 0

virtual import_sheet_view *get_sheet_view()

virtual import_sheet_properties *get_sheet_properties()

virtual import_data_table *get_data_table()
Get an interface for importing data tables. Note that the implementer may decide not to support this feature in which case this method returns NULL. The implementer is responsible for managing the life cycle of the returned interface object. The implementor should also initialize the internal state of the temporary data table object when this method is called.
Returns pointer to the data table interface object.

virtual import_auto_filter *get_auto_filter()
Get an interface for importing auto filter ranges. The implementor should also initialize the internal state of the temporary auto filter object when this method is called.
Returns pointer to the auto filter interface object.

virtual import_table *get_table()
Get an interface for importing tables. The implementer is responsible for managing the life cycle of the returned interface object. The implementor should also initialize the internal state of the temporary table object when this method is called.
Returns pointer to the table interface object, or NULL if the implementer doesn’t support importing of tables.

virtual import_conditional_format *get_conditional_format()
Get an interface for importing conditional formats. The implementer is responsible for managing the life cycle of the returned interface object.
Returns pointer to the conditional format interface object, or NULL if the implementer doesn’t support importing conditional formats.

virtual import_named_expression *get_named_expression()

virtual import_array_formula *get_array_formula()

virtual import_formula *get_formula()
Get an interface for importing formula cells.
Returns pointer to the formula interface object, or nullptr if the implementer doesn’t support importing of formula cells.

virtual void set_auto(row_t row, col_t col, const char *p, size_t n) = 0
Set raw string value to a cell and have the implementation auto-recognize its data type.
Parameters
• row – row ID
• col – column ID
• p – pointer to the first character of the raw string value.
• n – size of the raw string value.

virtual void set_string(row_t row, col_t col, string_id_t sindex) = 0
Set string value to a cell.
Parameters
• row – row ID
• col – column ID
• sindex – 0-based string index in the shared string table.

virtual void set_value(row_t row, col_t col, double value) = 0
Set numerical value to a cell.
Parameters
• row – row ID
• col – column ID
• value – value being assigned to the cell.

virtual void set_bool(row_t row, col_t col, bool value) = 0
Set a boolean value to a cell.
Parameters
• row – row ID
• col – column ID
• value – boolean value being assigned to the cell

virtual void set_date_time(row_t row, col_t col, int year, int month, int day, int hour, int minute, double second) = 0
Set date and time value to a cell.
Parameters
• row – row ID
• col – column ID
• year – 1-based value representing year
• month – 1-based value representing month, varying from 1 through 12.
• day – 1-based value representing day, varying from 1 through 31.
• hour – the hour of a day, ranging from 0 through 23.
• minute – the minute of an hour, ranging from 0 through 59.
• second – the second of a minute, ranging from 0 through 59.

virtual void set_format(row_t row, col_t col, size_t xf_index) = 0
Set cell format to specified cell. The cell format is referred to by the xf (cell format) index in the styles table.
Parameters
• row – row ID
• col – column ID
• xf_index – 0-based xf (cell format) index

virtual void set_format(row_t row_start, col_t col_start, row_t row_end, col_t col_end, size_t xf_index) = 0
Set cell format to specified cell range. The cell format is referred to by the xf (cell format) index in the styles table.
Parameters
• row_start – start row ID
• col_start – start column ID
• row_end – end row ID
• col_end – end column ID
• xf_index – 0-based xf (cell format) index

virtual void fill_down_cells(row_t src_row, col_t src_col, row_t range_size) = 0
Duplicate the value of the source cell to one or more cells located immediately below it.
Parameters
• src_row – row ID of the source cell
• src_col – column ID of the source cell
• range_size – number of cells below the source cell to copy the source cell value to. It must be at least one.

virtual range_size_t get_sheet_size() const = 0
Get the size of the sheet.
Returns structure containing the numbers of rows and columns of the sheet.
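
As a small worked example of fill_down_cells() and the range overload of set_format(), the sketch below stores cell values and xf indices in maps keyed by (row, column). The sheet_sketch class and its query helpers are hypothetical, and treating the end row and column of the format range as inclusive is an assumption, since the reference above does not say.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

using row_t = std::int32_t;
using col_t = std::int32_t;
using cell_key = std::pair<row_t, col_t>;

// Hypothetical sparse sheet; not derived from orcus's import_sheet.
class sheet_sketch
{
    std::map<cell_key, double> m_values;
    std::map<cell_key, std::size_t> m_formats;

public:
    void set_value(row_t row, col_t col, double value)
    {
        m_values[cell_key(row, col)] = value;
    }

    // Assumption: row_end and col_end are inclusive.
    void set_format(row_t row_start, col_t col_start, row_t row_end, col_t col_end, std::size_t xf_index)
    {
        for (row_t r = row_start; r <= row_end; ++r)
            for (col_t c = col_start; c <= col_end; ++c)
                m_formats[cell_key(r, c)] = xf_index;
    }

    // Copy the source value into the range_size cells directly below it.
    void fill_down_cells(row_t src_row, col_t src_col, row_t range_size)
    {
        auto it = m_values.find(cell_key(src_row, src_col));
        if (it == m_values.end())
            return; // nothing to duplicate
        const double v = it->second;
        for (row_t i = 1; i <= range_size; ++i)
            m_values[cell_key(src_row + i, src_col)] = v;
    }

    bool has_value(row_t row, col_t col) const
    {
        return m_values.count(cell_key(row, col)) != 0;
    }

    double value_at(row_t row, col_t col) const { return m_values.at(cell_key(row, col)); }
    std::size_t format_at(row_t row, col_t col) const { return m_formats.at(cell_key(row, col)); }
};

// Usage: one source cell copied into the three cells below it.
inline void sheet_sketch_example()
{
    sheet_sketch sheet;
    sheet.set_value(0, 0, 42.0);
    sheet.fill_down_cells(0, 0, 3);
    assert(sheet.value_at(3, 0) == 42.0);
    assert(!sheet.has_value(4, 0));
}
```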

import_sheet_properties

class orcus::spreadsheet::iface::import_sheet_properties
Interface for importing sheet properties. Sheet properties are those that are used for decorative purposes but are not necessarily a part of the sheet cell values.

Public Functions

virtual ~import_sheet_properties() = 0

virtual void set_column_width(col_t col, double width, orcus::length_unit_t unit) = 0

virtual void set_column_hidden(col_t col, bool hidden) = 0

virtual void set_row_height(row_t row, double height, orcus::length_unit_t unit) = 0

virtual void set_row_hidden(row_t row, bool hidden) = 0

virtual void set_merge_cell_range(const range_t &range) = 0
Specify merged cell range.
Parameters range – structure containing the top-left and bottom-right positions of the merged cell range.

import_sheet_view

class orcus::spreadsheet::iface::import_sheet_view

Public Functions

virtual ~import_sheet_view()

virtual void set_sheet_active() = 0
Set this sheet as the active sheet.

virtual void set_split_pane(double hor_split, double ver_split, const address_t &top_left_cell, sheet_pane_t active_pane) = 0
Set the information about split view in the current sheet.
Parameters
• hor_split – horizontal position of the split in 1/20th of a point, or 0 if none. “Horizontal” in this case indicates the column direction.
• ver_split – vertical position of the split in 1/20th of a point, or 0 if none. “Vertical” in this case indicates the row direction.
• top_left_cell – the top left visible cell in the bottom right pane.
• active_pane – active pane in this sheet.

virtual void set_frozen_pane(col_t visible_columns, row_t visible_rows, const address_t &top_left_cell, sheet_pane_t active_pane) = 0
Set the information about frozen view in the current sheet.
Parameters
• visible_columns – number of visible columns in the left pane.
• visible_rows – number of visible rows in the top pane.
• top_left_cell – the top left visible cell in the bottom right pane.
• active_pane – active pane in this sheet.

virtual void set_selected_range(sheet_pane_t pane, range_t range) = 0
Set the selected cursor range in a specified sheet pane.
Parameters
• pane – sheet pane associated with the selection. The top-left pane is used for a non-split sheet view.
• range – selected cursor range. The range will be 1 column by 1 row when the cursor is on a single cell only.

import_styles

class orcus::spreadsheet::iface::import_styles
Interface for styles. Note that because the default style must have an index of 0 in each style category, the caller must set the default styles first before importing and setting real styles. IDs of styles are assigned sequentially starting with 0 and upward in each style category.
In contrast to xf formatting, dxf (differential formats) formatting only stores the format information that is explicitly set. It does not store formatting from the default style. Applying a dxf format to an object only applies those explicitly set formats from the dxf entry, while all the other formats are retained.
Subclassed by orcus::spreadsheet::import_styles
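
The xf versus dxf distinction described above can be illustrated with optional attributes: a differential entry carries only the attributes that were explicitly set, and applying it leaves every other attribute of the target untouched. This is a minimal sketch assuming C++17 (for std::optional); full_format, differential_format and apply_dxf are hypothetical names, and the three attributes are chosen only for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical xf-like record: every attribute always has a value.
struct full_format
{
    bool bold = false;
    bool italic = false;
    std::string font_name = "Default";
};

// Hypothetical dxf-like record: an attribute is present only if explicitly set.
struct differential_format
{
    std::optional<bool> bold;
    std::optional<bool> italic;
    std::optional<std::string> font_name;
};

// Apply only the explicitly set attributes; retain everything else from base.
inline full_format apply_dxf(full_format base, const differential_format& dxf)
{
    if (dxf.bold)
        base.bold = *dxf.bold;
    if (dxf.italic)
        base.italic = *dxf.italic;
    if (dxf.font_name)
        base.font_name = *dxf.font_name;
    return base;
}

// Usage: the dxf entry sets bold only, so italic and the font name are retained.
inline void apply_dxf_example()
{
    full_format base;
    base.italic = true;

    differential_format dxf;
    dxf.bold = true;

    const full_format out = apply_dxf(base, dxf);
    assert(out.bold);
    assert(out.italic);
    assert(out.font_name == "Default");
}
```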

Public Functions

virtual ~import_styles() = 0

virtual void set_font_count(size_t n) = 0

virtual void set_font_bold(bool b) = 0

virtual void set_font_italic(bool b) = 0

virtual void set_font_name(const char *s, size_t n) = 0

virtual void set_font_size(double point) = 0

virtual void set_font_underline(underline_t e) = 0

virtual void set_font_underline_width(underline_width_t e) = 0

virtual void set_font_underline_mode(underline_mode_t e) = 0

virtual void set_font_underline_type(underline_type_t e) = 0

virtual void set_font_underline_color(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0

virtual void set_font_color(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0

virtual void set_strikethrough_style(strikethrough_style_t s) = 0

virtual void set_strikethrough_type(strikethrough_type_t s) = 0

virtual void set_strikethrough_width(strikethrough_width_t s) = 0

virtual void set_strikethrough_text(strikethrough_text_t s) = 0

virtual size_t commit_font() = 0

virtual void set_fill_count(size_t n) = 0
Set the total number of fill styles. This call is not strictly required but may slightly improve performance.
Parameters n – number of fill styles.

virtual void set_fill_pattern_type(fill_pattern_t fp) = 0
Set the type of fill pattern.
Parameters fp – fill pattern type.

virtual void set_fill_fg_color(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0
Set the foreground color of a fill. Note that for a solid fill type, the foreground color will be used.
Parameters
• alpha – alpha component ranging from 0 (fully transparent) to 255 (fully opaque).
• red – red component ranging from 0 to 255.
• green – green component ranging from 0 to 255.
• blue – blue component ranging from 0 to 255.

virtual void set_fill_bg_color(color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0
Set the background color of a fill. Note that this color will be ignored for a solid fill type.
Parameters
• alpha – alpha component ranging from 0 (fully transparent) to 255 (fully opaque).
• red – red component ranging from 0 to 255.
• green – green component ranging from 0 to 255.
• blue – blue component ranging from 0 to 255.

virtual size_t commit_fill() = 0
Commit the fill style currently in the buffer.
Returns the ID of the committed fill style, to be passed on to the set_xf_fill() method as its argument.

virtual void set_border_count(size_t n) = 0

virtual void set_border_style(border_direction_t dir, border_style_t style) = 0

virtual void set_border_color(border_direction_t dir, color_elem_t alpha, color_elem_t red, color_elem_t green, color_elem_t blue) = 0

virtual void set_border_width(border_direction_t dir, double width, orcus::length_unit_t unit) = 0

virtual size_t commit_border() = 0

virtual void set_cell_hidden(bool b) = 0

virtual void set_cell_locked(bool b) = 0

virtual void set_cell_print_content(bool b) = 0

virtual void set_cell_formula_hidden(bool b) = 0

virtual size_t commit_cell_protection() = 0

virtual void set_number_format_count(size_t n) = 0

virtual void set_number_format_identifier(size_t id) = 0

virtual void set_number_format_code(const char *s, size_t n) = 0

virtual size_t commit_number_format() = 0

virtual void set_cell_xf_count(size_t n) = 0

virtual void set_cell_style_xf_count(size_t n) = 0

virtual void set_dxf_count(size_t n) = 0

virtual void set_xf_font(size_t index) = 0

virtual void set_xf_fill(size_t index) = 0

virtual void set_xf_border(size_t index) = 0

virtual void set_xf_protection(size_t index) = 0

virtual void set_xf_number_format(size_t index) = 0

virtual void set_xf_style_xf(size_t index) = 0

virtual void set_xf_apply_alignment(bool b) = 0

virtual void set_xf_horizontal_alignment(hor_alignment_t align) = 0

virtual void set_xf_vertical_alignment(ver_alignment_t align) = 0

virtual size_t commit_cell_xf() = 0

virtual size_t commit_cell_style_xf() = 0

virtual size_t commit_dxf() = 0

virtual void set_cell_style_count(size_t n) = 0

virtual void set_cell_style_name(const char *s, size_t n) = 0

virtual void set_cell_style_xf(size_t index) = 0

virtual void set_cell_style_builtin(size_t index) = 0

virtual void set_cell_style_parent_name(const char *s, size_t n) = 0

virtual size_t commit_cell_style() = 0
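
The interplay between commit_fill() and set_xf_fill() documented above, together with the rule that IDs start at 0 with the default style committed first, might look like this in a toy implementor. The styles_sketch class, fill_stub and xf_stub are hypothetical, only the fill category is modeled, and resetting each buffer after a commit is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fill record holding only the foreground color.
struct fill_stub
{
    std::uint8_t alpha = 0;
    std::uint8_t red = 0;
    std::uint8_t green = 0;
    std::uint8_t blue = 0;
};

// Hypothetical cell format record that refers to a fill by ID.
struct xf_stub
{
    std::size_t fill = 0;
};

class styles_sketch
{
    fill_stub m_cur_fill; // fill buffer
    xf_stub m_cur_xf;     // cell format buffer

public:
    std::vector<fill_stub> fills;
    std::vector<xf_stub> cell_xfs;

    void set_fill_fg_color(std::uint8_t alpha, std::uint8_t red, std::uint8_t green, std::uint8_t blue)
    {
        m_cur_fill.alpha = alpha;
        m_cur_fill.red = red;
        m_cur_fill.green = green;
        m_cur_fill.blue = blue;
    }

    // IDs are sequential per category, starting at 0.
    std::size_t commit_fill()
    {
        fills.push_back(m_cur_fill);
        m_cur_fill = fill_stub();
        return fills.size() - 1;
    }

    void set_xf_fill(std::size_t index) { m_cur_xf.fill = index; }

    std::size_t commit_cell_xf()
    {
        cell_xfs.push_back(m_cur_xf);
        m_cur_xf = xf_stub();
        return cell_xfs.size() - 1;
    }
};

// Usage: commit the default fill first, then a red fill referenced by a cell format.
inline void styles_sketch_example()
{
    styles_sketch styles;
    const std::size_t default_fill = styles.commit_fill();
    styles.set_fill_fg_color(255, 255, 0, 0);
    const std::size_t red_fill = styles.commit_fill();
    styles.set_xf_fill(red_fill);
    const std::size_t xf = styles.commit_cell_xf();
    assert(default_fill == 0);
    assert(red_fill == 1);
    assert(styles.cell_xfs[xf].fill == red_fill);
}
```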

import_table

class orcus::spreadsheet::iface::import_table
Interface for table. A table is a range within a sheet that consists of one or more data columns with a header row that contains their labels.

Public Functions

virtual ~import_table() = 0

virtual import_auto_filter *get_auto_filter()

virtual void set_identifier(size_t id) = 0

virtual void set_range(const char *p_ref, size_t n_ref) = 0

virtual void set_totals_row_count(size_t row_count) = 0

virtual void set_name(const char *p, size_t n) = 0

virtual void set_display_name(const char *p, size_t n) = 0

virtual void set_column_count(size_t n) = 0

virtual void set_column_identifier(size_t id) = 0

virtual void set_column_name(const char *p, size_t n) = 0

virtual void set_column_totals_row_label(const char *p, size_t n) = 0

virtual void set_column_totals_row_function(totals_row_function_t func) = 0

virtual void commit_column() = 0

virtual void set_style_name(const char *p, size_t n) = 0

virtual void set_style_show_first_column(bool b) = 0

virtual void set_style_show_last_column(bool b) = 0

virtual void set_style_show_row_stripes(bool b) = 0

virtual void set_style_show_column_stripes(bool b) = 0

virtual void commit() = 0

export_factory

class orcus::spreadsheet::iface::export_factory
Subclassed by orcus::spreadsheet::export_factory

Public Functions

virtual ~export_factory() = 0

virtual const export_sheet *get_sheet(const char *sheet_name, size_t sheet_name_length) const = 0

export_sheet

class orcus::spreadsheet::iface::export_sheet

Public Functions

virtual ~export_sheet() = 0

virtual void write_string(std::ostream &os, orcus::spreadsheet::row_t row, orcus::spreadsheet::col_t col) const = 0

2.2.3 Spreadsheet Types

Types

typedef int32_t orcus::spreadsheet::row_t
typedef int32_t orcus::spreadsheet::col_t
typedef int32_t orcus::spreadsheet::sheet_t
typedef uint8_t orcus::spreadsheet::color_elem_t
typedef uint16_t orcus::spreadsheet::col_width_t
typedef uint16_t orcus::spreadsheet::row_height_t
typedef uint32_t orcus::spreadsheet::pivot_cache_id_t

Structs

struct orcus::spreadsheet::underline_attrs_t

Public Members

underline_t underline_style
underline_width_t underline_width
underline_mode_t underline_mode
underline_type_t underline_type

struct orcus::spreadsheet::address_t

Public Members

row_t row
col_t column

struct orcus::spreadsheet::range_size_t

Public Members

row_t rows
col_t columns

struct orcus::spreadsheet::range_t

Public Members

address_t first
address_t last

struct orcus::spreadsheet::color_rgb_t

Public Functions

color_rgb_t()

color_rgb_t(std::initializer_list vs)

color_rgb_t(const color_rgb_t &other)

color_rgb_t(color_rgb_t &&other)

color_rgb_t &operator=(const color_rgb_t &other)

Public Members

color_elem_t red
color_elem_t green
color_elem_t blue

Enums

enum orcus::spreadsheet::error_value_t
Values:
enumerator unknown
enumerator null
enumerator div0
enumerator value
enumerator ref
enumerator name
enumerator num
enumerator na

enum orcus::spreadsheet::border_direction_t
Values:
enumerator unknown
enumerator top
enumerator bottom
enumerator left
enumerator right
enumerator diagonal
enumerator diagonal_bl_tr
enumerator diagonal_tl_br

enum orcus::spreadsheet::border_style_t
Values:
enumerator unknown
enumerator none
enumerator solid
enumerator dash_dot
enumerator dash_dot_dot
enumerator dashed
enumerator dotted
enumerator double_border
enumerator hair
enumerator medium
enumerator medium_dash_dot
enumerator medium_dash_dot_dot
enumerator medium_dashed
enumerator slant_dash_dot
enumerator thick
enumerator thin
enumerator double_thin
enumerator fine_dashed

enum orcus::spreadsheet::fill_pattern_t
Values:
enumerator none
enumerator solid
enumerator dark_down
enumerator dark_gray
enumerator dark_grid
enumerator dark_horizontal
enumerator dark_trellis
enumerator dark_up
enumerator dark_vertical
enumerator gray_0625
enumerator gray_125
enumerator light_down
enumerator light_gray
enumerator light_grid
enumerator light_horizontal
enumerator light_trellis
enumerator light_up
enumerator light_vertical
enumerator medium_gray

enum orcus::spreadsheet::strikethrough_style_t
Values:
enumerator none
enumerator solid
enumerator dash
enumerator dot_dash
enumerator dot_dot_dash
enumerator dotted
enumerator long_dash
enumerator wave

enum orcus::spreadsheet::strikethrough_type_t
Values:
enumerator unknown
enumerator none
enumerator single
enumerator double_type

enum orcus::spreadsheet::strikethrough_width_t
Values:
enumerator unknown
enumerator width_auto
enumerator thin
enumerator medium
enumerator thick
enumerator bold

enum orcus::spreadsheet::strikethrough_text_t
Values:
enumerator unknown
enumerator slash
enumerator cross

enum orcus::spreadsheet::formula_grammar_t
Type that specifies the grammar of a formula expression. Each grammar may exhibit a different set of syntax rules.
Values:

enumerator unknown Grammar type is either unknown or unspecified.

enumerator xls_xml Grammar used by the Excel 2003 XML (aka XML Spreadsheet) format.

enumerator xlsx Grammar used by the Office Open XML spreadsheet format.

enumerator ods Grammar used by the OpenDocument Spreadsheet format.

enumerator gnumeric Grammar used by the Gnumeric XML format.

enum orcus::spreadsheet::formula_t Values:

enumerator unknown enumerator array enumerator data_table enumerator normal enumerator shared

enum orcus::spreadsheet::underline_t Values:

enumerator none enumerator single_line enumerator single_accounting enumerator double_line enumerator double_accounting enumerator dotted enumerator dash enumerator long_dash enumerator dot_dash enumerator dot_dot_dot_dash enumerator wave

enum orcus::spreadsheet::underline_width_t Values:

enumerator none enumerator normal enumerator bold enumerator thin enumerator medium enumerator thick enumerator positive_integer enumerator percent enumerator positive_length

enum orcus::spreadsheet::underline_mode_t Values:

enumerator continuos enumerator skip_white_space

enum orcus::spreadsheet::underline_type_t Values:

enumerator none enumerator single enumerator double_type

enum orcus::spreadsheet::hor_alignment_t Values:

enumerator unknown enumerator left enumerator center enumerator right enumerator justified enumerator distributed enumerator filled

enum orcus::spreadsheet::ver_alignment_t Values:

enumerator unknown enumerator top enumerator middle enumerator bottom enumerator justified enumerator distributed

enum orcus::spreadsheet::data_table_type_t
Type of data table. A data table can be a single-variable column type, a single-variable row type, or a double-variable type that uses both column and row input cells.
Values:

enumerator column enumerator row enumerator both

enum orcus::spreadsheet::totals_row_function_t
Function type used in the totals row of a table.
Values:

enumerator none enumerator sum enumerator minimum enumerator maximum enumerator average enumerator count enumerator count_numbers enumerator standard_deviation enumerator variance enumerator custom

enum orcus::spreadsheet::conditional_format_t Values:

enumerator unknown enumerator condition enumerator date enumerator formula enumerator colorscale enumerator databar enumerator iconset

enum orcus::spreadsheet::condition_operator_t Values:

enumerator unknown enumerator equal enumerator less enumerator greater enumerator greater_equal enumerator less_equal enumerator not_equal enumerator between enumerator not_between enumerator duplicate enumerator unique enumerator top_n enumerator bottom_n enumerator above_average enumerator below_average enumerator above_equal_average enumerator below_equal_average enumerator contains_error enumerator contains_no_error enumerator begins_with enumerator ends_with enumerator contains enumerator contains_blanks enumerator not_contains enumerator expression

enum orcus::spreadsheet::condition_type_t Values:

enumerator unknown enumerator value enumerator automatic enumerator max enumerator min enumerator formula enumerator percent enumerator percentile

enum orcus::spreadsheet::condition_date_t Values:

enumerator unknown enumerator today enumerator yesterday enumerator tomorrow enumerator last_7_days enumerator this_week enumerator next_week enumerator last_week enumerator this_month enumerator next_month enumerator last_month enumerator this_year enumerator next_year enumerator last_year

enum orcus::spreadsheet::databar_axis_t Values:

enumerator none enumerator middle enumerator automatic

enum orcus::spreadsheet::pivot_cache_group_by_t Values:

enumerator unknown enumerator days enumerator hours enumerator minutes enumerator months enumerator quarters enumerator range enumerator seconds enumerator years

2.2.4 Spreadsheet Global Functions

col_width_t orcus::spreadsheet::get_default_column_width()

row_height_t orcus::spreadsheet::get_default_row_height()

totals_row_function_t orcus::spreadsheet::to_totals_row_function_enum(const char *p, size_t n)
Convert a string representation of a totals row function name to its equivalent enum value.
Parameters
• p – pointer to the string buffer.
• n – size of the string buffer.
Returns enum value representing the totals row function.

pivot_cache_group_by_t orcus::spreadsheet::to_pivot_cache_group_by_enum(const char *p, size_t n)
Convert a string representation of a pivot cache group-by type to its equivalent enum value.
Parameters
• p – pointer to the string buffer.
• n – size of the string buffer.
Returns enum value representing the pivot cache group-by type.

error_value_t orcus::spreadsheet::to_error_value_enum(const char *p, size_t n)
Convert a string representation of an error value to its equivalent enum value.
Parameters
• p – pointer to the string buffer.
• n – size of the string buffer.
Returns enum value representing the error value.

color_rgb_t orcus::spreadsheet::to_color_rgb(const char *p, size_t n)
Convert a string representation of an RGB value to an equivalent struct value. The string representation is expected to be a 6-digit hexadecimal string that may or may not be prefixed with a ‘#’.
Parameters
• p – pointer to the string buffer that stores the string representation of the RGB value.
• n – length of the buffer.
Returns struct value representing an RGB value.
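The input format described for to_color_rgb can be illustrated with a small standalone sketch. The code below does not call orcus: rgb and parse_rgb_hex are hypothetical names, and returning an empty optional on malformed input is an assumption of this sketch, since the reference above does not state how invalid strings are handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a struct holding an RGB value.
struct rgb
{
    std::uint8_t red;
    std::uint8_t green;
    std::uint8_t blue;
};

// Parse a 6-digit hexadecimal color string that may or may not start with '#'.
// Returns std::nullopt on malformed input (an assumption of this sketch).
std::optional<rgb> parse_rgb_hex(const char* p, std::size_t n)
{
    if (n > 0 && *p == '#')
    {
        // Skip the optional prefix.
        ++p;
        --n;
    }

    if (n != 6)
        return std::nullopt;

    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        unsigned char c = static_cast<unsigned char>(p[i]);
        if (!std::isxdigit(c))
            return std::nullopt;

        int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
        v = v * 16 + static_cast<std::uint32_t>(digit);
    }

    return rgb{
        static_cast<std::uint8_t>((v >> 16) & 0xFF),
        static_cast<std::uint8_t>((v >> 8) & 0xFF),
        static_cast<std::uint8_t>(v & 0xFF)};
}
```

Under these assumptions, both "#FF8000" and "ff8000" yield the same red, green and blue components, while a string with fewer than six hexadecimal digits is rejected.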

2.3 Spreadsheet Import Filters

2.3.1 Plain Text (CSV)

class orcus::orcus_csv : public orcus::iface::import_filter

Public Functions

orcus_csv(spreadsheet::iface::import_factory *factory)

virtual void read_file(const std::string &filepath)
Expects a system path to a local file.

virtual void read_stream(const char *content, size_t len)
Expects the whole content of the file.

virtual const char *get_name() const

2.3.2 Open Document Spreadsheet

class orcus::orcus_ods : public orcus::iface::import_filter

Public Functions

orcus_ods(spreadsheet::iface::import_factory *factory)

~orcus_ods()

virtual void read_file(const std::string &filepath)
Expects a system path to a local file.

virtual void read_stream(const char *content, size_t len)
Expects the whole content of the file.

virtual const char *get_name() const

Public Static Functions

static bool detect(const unsigned char *blob, size_t size)

class orcus::import_ods

Public Static Functions

static void read_styles(const char *p, size_t n, spreadsheet::iface::import_styles *data)

2.3.3 2003 XML

class orcus::orcus_xls_xml : public orcus::iface::import_filter

Public Functions

orcus_xls_xml(spreadsheet::iface::import_factory *factory)

~orcus_xls_xml()

orcus_xls_xml(const orcus_xls_xml&) = delete

orcus_xls_xml &operator=(const orcus_xls_xml&) = delete

virtual void read_file(const std::string &filepath)
Expects a system path to a local file.

virtual void read_stream(const char *content, size_t len)
Expects the whole content of the file.

virtual const char *get_name() const

Public Static Functions

static bool detect(const unsigned char *blob, size_t size)

2.3.4 Microsoft Excel 2007 XML

class orcus::orcus_xlsx : public orcus::iface::import_filter

Public Functions

orcus_xlsx(spreadsheet::iface::import_factory *factory)

~orcus_xlsx()

orcus_xlsx(const orcus_xlsx&) = delete

orcus_xlsx &operator=(const orcus_xlsx&) = delete

virtual void read_file(const std::string &filepath)
Expects a system path to a local file.

virtual void read_stream(const char *content, size_t len)
Expects the whole content of the file.

virtual const char *get_name() const

Public Static Functions

static bool detect(const unsigned char *blob, size_t size)

class orcus::import_xlsx

Public Static Functions

static void read_table(const char *p, size_t n, spreadsheet::iface::import_table &table, spreadsheet::iface::import_reference_resolver &resolver)

2.3.5 Gnumeric XML

class orcus::orcus_gnumeric : public orcus::iface::import_filter

Public Functions

orcus_gnumeric(spreadsheet::iface::import_factory *factory)

~orcus_gnumeric()

virtual void read_file(const std::string &filepath)
Expects a system path to a local file.

virtual void read_stream(const char *content, size_t len)
Expects the whole content of the file.

virtual const char *get_name() const

Public Static Functions

static bool detect(const unsigned char *blob, size_t size)

2.3.6 Generic XML

class orcus::orcus_xml

Public Functions

orcus_xml(const orcus_xml&) = delete

orcus_xml &operator=(const orcus_xml&) = delete

orcus_xml(xmlns_repository &ns_repo, spreadsheet::iface::import_factory *im_fact, spreadsheet::iface::export_factory *ex_fact)

~orcus_xml()

void set_namespace_alias(const pstring &alias, const pstring &uri, bool default_ns = false)
Define a namespace and its alias used in a map file.
Parameters
• alias – alias for the namespace.
• uri – namespace value.
• default_ns – whether or not to use this namespace as the default namespace. When this value is set to true, the namespace being set will be applied to all elements and attributes used in the paths without explicit namespace values.

void set_cell_link(const pstring &xpath, const pstring &sheet, spreadsheet::row_t row, spreadsheet::col_t col)
Define a mapping of a single element or attribute to a single cell location.
Parameters
• xpath – path to the element or attribute to link.
• sheet – name of the sheet of the linked cell location.
• row – row index (0-based) of the linked cell location.
• col – column index (0-based) of the linked cell location.

void start_range(const pstring &sheet, spreadsheet::row_t row, spreadsheet::col_t col)
Initiate the mapping definition of a linked range. The definition will get committed when the commit_range method is called.
Parameters
• sheet – name of the sheet of the linked cell location.
• row – row index (0-based) of the linked cell location.
• col – column index (0-based) of the linked cell location.

void append_field_link(const pstring &xpath, const pstring &label)
Append a field that is mapped to a specified path in the XML document to the current linked range.
Parameters
• xpath – path to the element or attribute to link as a field.
• label – custom header label to use in lieu of the name of the linked entity.

void set_range_row_group(const pstring &xpath)
Set the element located in the specified path as a row group in the current linked range. If the element is defined as a row-group element, the row index will increment whenever that element closes.
Parameters xpath – path to the element to use as a row group element.

void commit_range()
Commit the mapping definition of the current range.

void append_sheet(const pstring &name)
Append a new sheet to the spreadsheet document.
Parameters name – name of the sheet.

void read_stream(const char *p, size_t n)
Read the stream containing the source XML document.
Parameters
• p – pointer to the buffer containing the source XML document.
• n – size of the buffer.

void read_map_definition(const char *p, size_t n)
Read an XML stream that contains an entire set of mapping rules. This method also inserts all necessary sheets into the document model.
Parameters
• p – pointer to the buffer that contains the XML string.
• n – size of the buffer.

void detect_map_definition(const char *p, size_t n)
Read a stream containing the source XML document, automatically detect all linkable ranges and import them one range per sheet.
Parameters
• p – pointer to the buffer that contains the source XML document.
• n – size of the buffer.

void write_map_definition(const char *p, size_t n, std::ostream &out) const
Read a stream containing the source XML document, automatically detect all linkable ranges, and write a map definition file depicting the detected ranges.
Parameters
• p – pointer to the buffer that contains the source XML document.
• n – size of the buffer.
• out – output stream to write the map definition file to.

void write(const char *p_in, size_t n_in, std::ostream &out) const
Write the linked cells and ranges in the spreadsheet document as an XML document using the same map definition rules used to load the content. Note that this requires the source XML document stream, as it re-uses parts of the source stream.
Parameters
• p_in – pointer to the buffer that contains the source XML document.
• n_in – size of the buffer containing the source XML document.
• out – output stream to write the XML document to.
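The row-group behaviour described for set_range_row_group (the row index advances whenever the row-group element closes) can be pictured with a toy model. The sketch below is not orcus code and does not reflect how orcus implements linked ranges: linked_range, on_value, on_element_close and import_sample are hypothetical names, and the paths are made-up examples.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model of a linked range; not an orcus class.
class linked_range
{
    std::vector<std::string> m_field_paths;        // one path per column
    std::string m_row_group_path;                  // element that ends a row
    std::vector<std::vector<std::string>> m_rows;  // imported cell values
    std::size_t m_current_row = 0;

public:
    void append_field_link(std::string path)
    {
        m_field_paths.push_back(std::move(path));
    }

    void set_row_group(std::string path)
    {
        m_row_group_path = std::move(path);
    }

    // Store a value found at the given path in the current row.
    void on_value(const std::string& path, const std::string& value)
    {
        for (std::size_t col = 0; col < m_field_paths.size(); ++col)
        {
            if (m_field_paths[col] != path)
                continue;

            if (m_rows.size() <= m_current_row)
                m_rows.resize(m_current_row + 1, std::vector<std::string>(m_field_paths.size()));

            m_rows[m_current_row][col] = value;
        }
    }

    // Advance to the next row when the row-group element closes.
    void on_element_close(const std::string& path)
    {
        if (path == m_row_group_path)
            ++m_current_row;
    }

    const std::vector<std::vector<std::string>>& rows() const
    {
        return m_rows;
    }
};

// Feed two hypothetical <record> elements through the model.
std::vector<std::vector<std::string>> import_sample()
{
    linked_range range;
    range.append_field_link("/data/record/name");
    range.append_field_link("/data/record/age");
    range.set_row_group("/data/record");

    range.on_value("/data/record/name", "Ann");
    range.on_value("/data/record/age", "34");
    range.on_element_close("/data/record");

    range.on_value("/data/record/name", "Bob");
    range.on_element_close("/data/record"); // the second record has no age

    return range.rows();
}
```

In this model each closing record element starts a new row, so a record with a missing field simply leaves that column empty instead of shifting later values upward.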

2.4 Document Model

2.4.1 Spreadsheet Document

Document

class orcus::spreadsheet::document : public orcus::iface::document_dumper
Internal document representation used only for testing the filters. It uses ixion’s model_context implementation to store raw cell values.

Public Functions

document(const document&) = delete

document &operator=(const document&) = delete

document(const range_size_t &sheet_size)

~document()

import_shared_strings *get_shared_strings()

const import_shared_strings *get_shared_strings() const

styles &get_styles()

const styles &get_styles() const

pivot_collection &get_pivot_collection()

const pivot_collection &get_pivot_collection() const

sheet *append_sheet(const pstring &sheet_name)

sheet *get_sheet(const pstring &sheet_name)

const sheet *get_sheet(const pstring &sheet_name) const

sheet *get_sheet(sheet_t sheet_pos)

const sheet *get_sheet(sheet_t sheet_pos) const

void clear()
Clear document content, to make it empty.

void recalc_formula_cells()
Calculate those formula cells that have been newly inserted and have not yet been calculated.

virtual void dump(dump_format_t format, const std::string &output) const override

void dump_flat(const std::string &outdir) const
Dump document content to specified output directory in flat format.
Parameters outdir – path to the output directory.

void dump_html(const ::std::string &outdir) const
Dump document content to specified output directory in HTML format.
Parameters outdir – path to the output directory.

void dump_json(const ::std::string &outdir) const
Dump document content to specified output directory in JSON format.
Parameters outdir – path to the output directory.

void dump_csv(const std::string &outdir) const
Dump document content to specified output directory in CSV format.
Parameters outdir – path to the output directory.

virtual void dump_check(std::ostream &os) const override
Dump document content to stdout in the special format used for content verification during unit tests.

sheet_t get_sheet_index(const pstring &name) const

pstring get_sheet_name(sheet_t sheet_pos) const

range_size_t get_sheet_size() const

void set_sheet_size(const range_size_t &sheet_size)

size_t get_sheet_count() const

void set_origin_date(int year, int month, int day)

date_time_t get_origin_date() const

void set_formula_grammar(formula_grammar_t grammar)

formula_grammar_t get_formula_grammar() const

const ixion::formula_name_resolver *get_formula_name_resolver(formula_ref_context_t cxt) const

ixion::model_context &get_model_context()

const ixion::model_context &get_model_context() const

const document_config &get_config() const

void set_config(const document_config &cfg)

string_pool &get_string_pool()

void insert_table(table_t *p)
Insert a new table object into the document. The document will take ownership of the inserted object after the call. The object will get inserted only when there is no pre-existing table object of the same name. The object not being inserted will be deleted.
Parameters p – table object to insert.

const table_t *get_table(const pstring &name) const

void finalize()
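The ownership rule stated for insert_table (the document takes ownership, keeps the object only if no table of the same name exists, and deletes it otherwise) can be sketched in isolation. The code below is a minimal standalone illustration, not the orcus implementation: table, table_store and the live counter are hypothetical, and the bool return value is added here only to make the outcome observable.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a table object; only the name matters here.
struct table
{
    static int live; // number of table objects currently alive
    std::string name;

    explicit table(std::string n) : name(std::move(n)) { ++live; }
    ~table() { --live; }
};

int table::live = 0;

// Hypothetical container that mimics the documented insert_table contract.
class table_store
{
    std::map<std::string, std::unique_ptr<table>> m_tables;

public:
    // Takes ownership of p in every case. Returns true if the table was
    // stored, false if it was rejected (and therefore deleted).
    bool insert_table(table* p)
    {
        if (!p)
            return false;

        std::unique_ptr<table> owned(p);
        if (m_tables.count(owned->name))
            return false; // 'owned' deletes the rejected table here

        std::string name = owned->name;
        m_tables.emplace(std::move(name), std::move(owned));
        return true;
    }

    // Returns the stored table, or nullptr if no table has that name.
    const table* get_table(const std::string& name) const
    {
        auto it = m_tables.find(name);
        return it == m_tables.end() ? nullptr : it->second.get();
    }
};
```

The practical consequence for a caller is that the raw pointer must not be used or deleted after the call, whether or not the insertion succeeded.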

Sheet

class orcus::spreadsheet::sheet
This class represents a single sheet instance in the internal document model.

Public Functions

sheet(document &doc, sheet_t sheet_index)

virtual ~sheet()

void set_auto(row_t row, col_t col, const char *p, size_t n)

void set_string(row_t row, col_t col, string_id_t sindex)

void set_value(row_t row, col_t col, double value)

void set_bool(row_t row, col_t col, bool value)

void set_date_time(row_t row, col_t col, int year, int month, int day, int hour, int minute, double second)

void set_format(row_t row, col_t col, size_t index)

void set_format(row_t row_start, col_t col_start, row_t row_end, col_t col_end, size_t index)

void set_formula(row_t row, col_t col, const ixion::formula_tokens_store_ptr_t &tokens)

void set_formula(row_t row, col_t col, const ixion::formula_tokens_store_ptr_t &tokens, ixion::formula_result result)

void set_grouped_formula(const range_t &range, ixion::formula_tokens_t tokens)

void set_grouped_formula(const range_t &range, ixion::formula_tokens_t tokens, ixion::formula_result result)

void set_col_width(col_t col, col_width_t width)

col_width_t get_col_width(col_t col, col_t *col_start, col_t *col_end) const

void set_col_hidden(col_t col, bool hidden)

bool is_col_hidden(col_t col, col_t *col_start, col_t *col_end) const

void set_row_height(row_t row, row_height_t height)

row_height_t get_row_height(row_t row, row_t *row_start, row_t *row_end) const

void set_row_hidden(row_t row, bool hidden)

bool is_row_hidden(row_t row, row_t *row_start, row_t *row_end) const

void set_merge_cell_range(const range_t &range)

void fill_down_cells(row_t src_row, col_t src_col, row_t range_size)

range_t get_merge_cell_range(row_t row, col_t col) const
Return the size of a merged cell range.
Parameters
• row – row position of the upper-left cell.
• col – column position of the upper-left cell.
Returns merged cell range.

size_t get_string_identifier(row_t row, col_t col) const

auto_filter_t *get_auto_filter_data()

const auto_filter_t *get_auto_filter_data() const

void set_auto_filter_data(auto_filter_t *p)

ixion::abs_range_t get_data_range() const
Return the smallest range that contains all non-empty cells in this sheet. The top-left corner of the returned range is always column 0 and row 0.
Returns smallest range that contains all non-empty cells.

sheet_t get_index() const

date_time_t get_date_time(row_t row, col_t col) const

void finalize()

void dump_flat(std::ostream &os) const

void dump_check(std::ostream &os, const pstring &sheet_name) const

void dump_html(std::ostream &os) const

void dump_json(std::ostream &os) const

void dump_csv(std::ostream &os) const

size_t get_cell_format(row_t row, col_t col) const
Get the cell format ID of specified cell.
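The range returned by get_data_range, described earlier in this class as the smallest range that covers all non-empty cells and always starts at row 0 and column 0, can be computed as sketched below. This is a standalone illustration rather than the orcus implementation: cell_pos, cell_range and data_range are hypothetical, and the valid flag for an empty sheet is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical 0-based cell position; not an orcus or ixion type.
struct cell_pos
{
    int row;
    int col;
};

// Hypothetical range type. 'valid' is false when there are no non-empty cells.
struct cell_range
{
    cell_pos first;
    cell_pos last;
    bool valid;
};

// Compute the smallest range that starts at (0, 0) and covers every position.
cell_range data_range(const std::vector<cell_pos>& non_empty_cells)
{
    cell_range range{{0, 0}, {0, 0}, false};

    for (const cell_pos& pos : non_empty_cells)
    {
        range.last.row = std::max(range.last.row, pos.row);
        range.last.col = std::max(range.last.col, pos.col);
        range.valid = true;
    }

    return range;
}
```

Because the top-left corner is pinned to the origin, leading empty rows and columns are part of the result; only the bottom-right corner depends on the data.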

Pivot Table

struct orcus::spreadsheet::pivot_cache_record_value_t

Public Types

enum value_type Values:

enumerator unknown enumerator boolean enumerator date_time enumerator character enumerator numeric enumerator blank enumerator error enumerator shared_item_index

Public Functions

pivot_cache_record_value_t()

pivot_cache_record_value_t(const char *cp, size_t cn)

pivot_cache_record_value_t(double v)

pivot_cache_record_value_t(size_t index)

bool operator==(const pivot_cache_record_value_t &other) const

bool operator!=(const pivot_cache_record_value_t &other) const

Public Members

value_type type
bool boolean
const char *p
size_t n
struct orcus::spreadsheet::pivot_cache_record_value_t::[anonymous]::[anonymous] character
int year
int month
int day
int hour
int minute
double second
struct orcus::spreadsheet::pivot_cache_record_value_t::[anonymous]::[anonymous] date_time
double numeric
size_t shared_item_index
union orcus::spreadsheet::pivot_cache_record_value_t::[anonymous] value

struct orcus::spreadsheet::pivot_cache_item_t

Public Types

enum item_type Values:

enumerator unknown enumerator boolean enumerator date_time enumerator character enumerator numeric enumerator blank enumerator error

Public Functions

pivot_cache_item_t()

pivot_cache_item_t(const char *cp, size_t cn)

pivot_cache_item_t(double numeric)

pivot_cache_item_t(bool boolean)

pivot_cache_item_t(const date_time_t &date_time)

pivot_cache_item_t(error_value_t error)

pivot_cache_item_t(const pivot_cache_item_t &other)

pivot_cache_item_t(pivot_cache_item_t &&other)

bool operator<(const pivot_cache_item_t &other) const

bool operator==(const pivot_cache_item_t &other) const

pivot_cache_item_t &operator=(pivot_cache_item_t other)

void swap(pivot_cache_item_t &other)

Public Members

item_type type
const char *p
size_t n
struct orcus::spreadsheet::pivot_cache_item_t::[anonymous]::[anonymous] character
int year
int month
int day
int hour
int minute
double second
struct orcus::spreadsheet::pivot_cache_item_t::[anonymous]::[anonymous] date_time
double numeric
error_value_t error
bool boolean
union orcus::spreadsheet::pivot_cache_item_t::[anonymous] value

struct orcus::spreadsheet::pivot_cache_group_data_t
Group data for a pivot cache field.

Public Functions

pivot_cache_group_data_t(size_t _base_field)

pivot_cache_group_data_t(const pivot_cache_group_data_t &other)

pivot_cache_group_data_t(pivot_cache_group_data_t &&other)

pivot_cache_group_data_t() = delete

Public Members

pivot_cache_indices_t base_to_group_indices Mapping of base field member indices to the group field item indices.

boost::optional<range_grouping_type> range_grouping

pivot_cache_items_t items
Individual items comprising the group.

size_t base_field 0-based index of the base field.

struct range_grouping_type

Public Functions

range_grouping_type() = default

range_grouping_type(const range_grouping_type &other) = default

Public Members

pivot_cache_group_by_t group_by = pivot_cache_group_by_t::range
bool auto_start = true
bool auto_end = true
double start = 0.0
double end = 0.0
double interval = 1.0
date_time_t start_date
date_time_t end_date

struct orcus::spreadsheet::pivot_cache_field_t

Public Functions

pivot_cache_field_t()

pivot_cache_field_t(const pstring &_name)

pivot_cache_field_t(const pivot_cache_field_t &other)

pivot_cache_field_t(pivot_cache_field_t &&other)

Public Members

pstring name Field name. It must be interned with the string pool belonging to the document.

pivot_cache_items_t items
boost::optional<double> min_value
boost::optional<double> max_value
boost::optional<date_time_t> min_date
boost::optional<date_time_t> max_date
std::unique_ptr<pivot_cache_group_data_t> group_data

class orcus::spreadsheet::pivot_cache

Public Types

using fields_type = std::vector<pivot_cache_field_t>
using records_type = std::vector<pivot_cache_record_t>

Public Functions

pivot_cache(pivot_cache_id_t cache_id, string_pool &sp)

~pivot_cache()

void insert_fields(fields_type fields)
Bulk-insert all the fields in one step. Note that this will replace any pre-existing fields.
Parameters fields – field instances to move into storage.

void insert_records(records_type record)

size_t get_field_count() const

const pivot_cache_field_t *get_field(size_t index) const
Retrieve a field data by its index.
Parameters index – index of the field to retrieve.
Returns pointer to the field instance, or nullptr if the index is out-of-range.

pivot_cache_id_t get_id() const

const records_type &get_all_records() const

class orcus::spreadsheet::pivot_collection

Public Functions

pivot_collection(document &doc)

~pivot_collection()

void insert_worksheet_cache(const pstring &sheet_name, const ixion::abs_range_t &range, std::unique_ptr<pivot_cache> &&cache)
Insert a new pivot cache associated with a worksheet source.
Parameters
• sheet_name – name of the sheet where the source data is.
• range – range of the source data. Note that the sheet indices are not used.
• cache – pivot cache instance to store.

void insert_worksheet_cache(const pstring &table_name, std::unique_ptr<pivot_cache> &&cache)
Insert a new pivot cache associated with a table name.
Parameters
• table_name – source table name.
• cache – pivot cache instance to store.

size_t get_cache_count() const
Count the number of pivot caches currently stored.
Returns number of pivot caches currently stored in the document.

const pivot_cache *get_cache(const pstring &sheet_name, const ixion::abs_range_t &range) const

pivot_cache *get_cache(pivot_cache_id_t cache_id)

const pivot_cache *get_cache(pivot_cache_id_t cache_id) const
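Two contracts documented for pivot_cache earlier in this section are easy to overlook: insert_fields replaces any fields already stored, and get_field returns nullptr rather than throwing when the index is out of range. The following standalone sketch models just those two rules; field and field_cache are hypothetical types and do not come from orcus.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pivot cache field.
struct field
{
    std::string name;
};

// Hypothetical cache that models only field storage and lookup.
class field_cache
{
    std::vector<field> m_fields;

public:
    // Move the given fields into storage, replacing any existing ones.
    void insert_fields(std::vector<field> fields)
    {
        m_fields = std::move(fields);
    }

    std::size_t get_field_count() const
    {
        return m_fields.size();
    }

    // Return a pointer to the field, or nullptr if the index is out of range.
    const field* get_field(std::size_t index) const
    {
        return index < m_fields.size() ? &m_fields[index] : nullptr;
    }
};
```

A caller written against this contract checks the returned pointer before dereferencing it, instead of relying on an exception.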

Import Factory

class orcus::spreadsheet::import_factory : public orcus::spreadsheet::iface::import_factory

Public Functions

import_factory(document &doc)

import_factory(document &doc, view &view)

virtual ~import_factory()

virtual iface::import_global_settings *get_global_settings() override

virtual iface::import_shared_strings *get_shared_strings() override
Returns pointer to the shared strings instance. It may return NULL if the client app doesn’t support shared strings.

virtual iface::import_styles *get_styles() override
Returns pointer to the styles instance. It may return NULL if the client app doesn’t support styles.

virtual iface::import_named_expression *get_named_expression() override

virtual iface::import_reference_resolver *get_reference_resolver(formula_ref_context_t cxt) override

virtual iface::import_pivot_cache_definition *create_pivot_cache_definition(orcus::spreadsheet::pivot_cache_id_t cache_id) override
Create an interface for pivot cache definition import for a specified cache ID. In case a pivot cache already exists for the passed ID, the client app should overwrite the existing cache with a brand-new cache instance.
Parameters cache_id – numeric ID associated with the pivot cache.
Returns pointer to the pivot cache interface instance. It may return NULL if the client app doesn’t support pivot tables.

virtual iface::import_pivot_cache_records *create_pivot_cache_records(orcus::spreadsheet::pivot_cache_id_t cache_id) override
Create an interface for pivot cache records import for a specified cache ID.
Parameters cache_id – numeric ID associated with the pivot cache.
Returns pointer to the pivot cache records interface instance. It may return nullptr if the client app doesn’t support pivot tables.

virtual iface::import_sheet *append_sheet(sheet_t sheet_index, const char *sheet_name, size_t sheet_name_length) override
Append a sheet with specified sheet position index and name.
Parameters
• sheet_index – position index of the sheet to be appended. It is 0-based, i.e. the first sheet to be appended will have an index value of 0.
• sheet_name – pointer to the first character in the buffer where the sheet name is stored.
• sheet_name_length – length of the sheet name.
Returns pointer to the sheet instance. It may return nullptr if the client app fails to append a new sheet.

virtual iface::import_sheet *get_sheet(const char *sheet_name, size_t sheet_name_length) override
Returns pointer to the sheet instance whose name matches the name passed to this method. It returns nullptr if no sheet instance exists by the specified name.

virtual iface::import_sheet *get_sheet(sheet_t sheet_index) override
Retrieve sheet instance by specified numerical sheet index.
Parameters sheet_index – sheet index
Returns pointer to the sheet instance, or nullptr if no sheet instance exists at specified sheet index position.

virtual void finalize() override
This method is called at the end of import, to give the implementor a chance to perform post-processing if necessary.

void set_default_row_size(row_t row_size)

void set_default_column_size(col_t col_size)

void set_character_set(character_set_t charset)

character_set_t get_character_set() const

void set_recalc_formula_cells(bool b)
When this flag is set to true, formula cells with no cached results will be re-calculated upon loading.
Parameters b – value of this flag.

void set_formula_error_policy(formula_error_policy_t policy)

2.4.2 JSON Document Tree

Document tree

class orcus::json::document_tree
This class stores a parsed JSON document tree structure.

Public Functions

document_tree()

document_tree(const document_tree&) = delete

document_tree(document_tree &&other)

document_tree(document_resource &res)

document_tree(std::initializer_list vs)

document_tree(array vs)

document_tree(object obj)

~document_tree()

document_tree &operator=(std::initializer_list vs)

document_tree &operator=(array vs)

document_tree &operator=(object obj)

void load(const std::string &strm, const json_config &config)
Load raw string stream containing a JSON structure to populate the document tree.
Parameters
• strm – stream containing a JSON structure.
• config – configuration object.

void load(const char *p, size_t n, const json_config &config)
Load raw string stream containing a JSON structure to populate the document tree.
Parameters
• p – pointer to the stream containing a JSON structure.
• n – size of the stream.
• config – configuration object.

json::const_node get_document_root() const
Get the root node of the document.
Returns root node of the document.

json::node get_document_root()
Get the root node of the document.
Returns root node of the document.

std::string dump() const
Dump the JSON document tree to string.
Returns a string representation of the JSON document tree.

std::string dump_xml() const
Dump the JSON document tree to an XML structure.
Returns a string containing an XML structure representing the JSON content.

void swap(document_tree &other)
Swap the content of the document with another document instance.
Parameters other – document instance to swap the content with.

struct orcus::json_config

Public Functions

json_config()

~json_config()

Public Members

std::string input_path
Path of the JSON file being parsed, in case the JSON string originates from a file. This parameter is required if external JSON files need to be resolved. Otherwise it’s optional.

std::string output_path
Path of the file to which output is written. Used only from the orcus-json command line tool.

dump_format_t output_format Output format type. Used only from the orcus-json command line tool.

bool preserve_object_order Control whether or not to preserve the order of object’s child name/value pairs. By definition, JSON’s object is an unordered set of name/value pairs, but in some cases preserving the original order may be desirable.

bool resolve_references Control whether or not to resolve JSON references to external files.

bool persistent_string_values
When true, the document tree should allocate memory and hold copies of string values in the tree. When false, no extra memory is allocated for string values in the tree and the string values simply point to the original JSON string stream. In other words, when this option is set to false, the caller must ensure that the JSON string stream instance stays alive for the entire life cycle of the document tree.

class orcus::json::const_node
Each node instance represents a JSON value stored in the document tree. It’s immutable.
Subclassed by orcus::json::node
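As a side note on the json_config::persistent_string_values option described above, the difference between holding copies and pointing into the source stream can be shown with std::string_view. The sketch below is a standalone model and not orcus code: string_value_store and points_into are hypothetical names, and the container choices are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical toy store that keeps string values either as owned copies
// (persistent mode) or as views into the caller's stream (non-persistent).
class string_value_store
{
    bool m_persistent;
    std::deque<std::string> m_copies;        // stable storage for copies
    std::vector<std::string_view> m_values;  // what the store hands out

public:
    explicit string_value_store(bool persistent) : m_persistent(persistent) {}

    void add(std::string_view value_in_stream)
    {
        if (m_persistent)
        {
            // Copy the characters so the value outlives the stream.
            m_copies.emplace_back(value_in_stream);
            m_values.push_back(m_copies.back());
        }
        else
        {
            // Keep a view only; the stream must stay alive and unchanged.
            m_values.push_back(value_in_stream);
        }
    }

    std::string_view value(std::size_t index) const
    {
        return m_values[index];
    }
};

// Return true if the value's characters are located inside the stream buffer.
bool points_into(std::string_view value, const std::string& stream)
{
    std::less<const char*> before;
    const char* begin = stream.data();
    const char* end = begin + stream.size();
    return !before(value.data(), begin) && before(value.data(), end);
}
```

In the non-persistent mode the stored value shares memory with the input buffer, which avoids allocations but leaves the value dangling as soon as the buffer is destroyed or modified.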

Public Functions

const_node() = delete

const_node(const const_node &other)

const_node(const_node &&rhs)

~const_node()

node_t type() const
    Get the type of a node.
    Returns node type.

size_t child_count() const
    Get the number of child nodes if any.
    Returns number of child nodes.

std::vector keys() const
    Get a list of keys stored in a JSON object node.
    Throws orcus::json::document_error – if the node is not of the object type.
    Returns a list of keys.

pstring key(size_t index) const
    Get the key by index in a JSON object node. This method works only when the preserve object order option is set.
    Parameters index – 0-based key index.
    Throws
        • orcus::json::document_error – if the node is not of the object type.
        • std::out_of_range – if the index is equal to or greater than the number of keys stored in the node.
    Returns key value.

bool has_key(const pstring &key) const
    Query whether or not a particular key exists in a JSON object node.
    Parameters key – key value.
    Returns true if this object node contains the specified key, otherwise false. If this node is not of a JSON object type, false is returned.

const_node child(size_t index) const
    Get a child node by index.
    Parameters index – 0-based index of a child node.
    Throws
        • orcus::json::document_error – if the node is not one of the object or array types.
        • std::out_of_range – if the index is equal to or greater than the number of child nodes that the node has.
    Returns child node instance.

const_node child(const pstring &key) const
    Get a child node by textual key value.
    Parameters key – textual key value to get a child node by.
    Throws orcus::json::document_error – if the node is not of the object type, or the node doesn’t have the specified key.
    Returns child node instance.

const_node parent() const
    Get the parent node.
    Throws orcus::json::document_error – if the node doesn’t have a parent node, which implies that the node is a root node.
    Returns parent node instance.

const_node back() const
    Get the last child node.
    Throws orcus::json::document_error – if the node is not of array type or the node has no children.
    Returns last child node instance.

pstring string_value() const
    Get the string value of a JSON string node.
    Throws orcus::json::document_error – if the node is not of the string type.
    Returns string value.

double numeric_value() const
    Get the numeric value of a JSON number node.
    Throws orcus::json::document_error – if the node is not of the number type.
    Returns numeric value.

const_node &operator=(const const_node &other)

uintptr_t identity() const
    Return an identifier of the JSON value object that the node represents. The identifier is derived directly from the memory address of the value object.
    Returns identifier of the JSON value object.

const_node_iterator begin() const

const_node_iterator end() const

class orcus::json::node : public orcus::json::const_node
    Each node instance represents a JSON value stored in the document tree. This class allows mutable operations.

Public Functions

node() = delete

node(const node &other)

node(node &&rhs)

~node()

node &operator=(const node &other)

node &operator=(const detail::init::node &v)

node operator[](const pstring &key)

node child(size_t index)
    Get a child node by index.
    Parameters index – 0-based index of a child node.
    Throws
        • orcus::json::document_error – if the node is not one of the object or array types.
        • std::out_of_range – if the index is equal to or greater than the number of child nodes that the node has.
    Returns child node instance.

node child(const pstring &key)
    Get a child node by textual key value.
    Parameters key – textual key value to get a child node by.
    Throws orcus::json::document_error – if the node is not of the object type, or the node doesn’t have the specified key.
    Returns child node instance.

node parent()
    Get the parent node.
    Throws orcus::json::document_error – if the node doesn’t have a parent node, which implies that the node is a root node.
    Returns parent node instance.

node back()
    Get the last child node.
    Throws orcus::json::document_error – if the node is not of array type or the node has no children.
    Returns last child node instance.

void push_back(const detail::init::node &v)
    Append a new node value to the end of the array.
    Throws orcus::json::document_error – if the node is not of array type.
    Parameters v – new node value to append to the end of the array.

class orcus::json::array
    This class represents a JSON array, to be used to explicitly create an array instance during initialization.

Public Functions

array()

array(const array&) = delete

array(array &&other)

array(std::initializer_list vs)

~array()

Friends

friend class detail::init::node

class orcus::json::object
    This class represents a JSON object, primarily to be used to create an empty object instance.


Public Functions

object()

object(const object&) = delete

object(object &&other)

~object()

class orcus::json::detail::init::node
    Node to store an initial value during document tree initialization. It’s not meant to be instantiated explicitly. A value passed from the braced initialization list is implicitly converted to an instance of this class.

Public Functions

node(double v)

node(int v)

node(bool b)

node(std::nullptr_t)

node(const char *p)

node(const std::string &s)

node(std::initializer_list vs)

node(json::array array)

node(json::object obj)

node(const node &other) = delete

node(node &&other)

~node()

node &operator=(node other) = delete


Friends

friend class ::orcus::json::document_tree

friend class ::orcus::json::node

enum orcus::json::node_t
    Values:

enumerator unset node type is not set.

enumerator string JSON string node. A node of this type contains a string value.

enumerator number JSON number node. A node of this type contains a numeric value.

enumerator object JSON object node. A node of this type contains one or more key-value pairs.

enumerator array JSON array node. A node of this type contains one or more child nodes.

enumerator boolean_true JSON boolean node containing a value of ‘true’.

enumerator boolean_false JSON boolean node containing a value of ‘false’.

enumerator null JSON node containing a ‘null’ value.

Exceptions

class orcus::json::document_error : public orcus::general_error
    Exception related to JSON document tree construction.
    Subclassed by orcus::json::key_value_error

Public Functions

document_error(const std::string &msg)

virtual ~document_error()

class orcus::json::key_value_error : public orcus::json::document_error
    Exception that gets thrown due to ambiguity when you specify a braced list that can be interpreted either as a key-value pair inside an object or as values of an array.

Public Functions

key_value_error(const std::string &msg)

virtual ~key_value_error()

2.4.3 YAML Document Tree

CHAPTER THREE

PYTHON API

3.1 Packages

3.1.1 orcus

orcus.detect_format()
    Detects the file format of the stream.
    Parameters stream – either bytes, or file object containing a byte stream.
    Return type orcus.FormatType
    Returns enum value specifying the detected file format.
    Example:

        import orcus

        with open("path/to/file", "rb") as f:
            fmt = orcus.detect_format(f)

Cell

class orcus.Cell
    This class represents a single cell within a Sheet object.

    get_formula_tokens()
        Get an iterator object for formula tokens if the cell is a formula cell. This method returns None for a non-formula cell.
        Return type FormulaTokens
        Returns an iterator object for a formula cell.

    type: orcus.CellType
        Attribute specifying the type of this cell.

    value
        Attribute containing the value of the cell.

    formula: str
        Attribute containing the formula string in case of a formula cell. This value will be None for a non-formula cell.

CellType

class orcus.CellType(value)
    Collection of cell types stored in spreadsheet.

    UNKNOWN = 0
    EMPTY = 1
    BOOLEAN = 2
    NUMERIC = 3
    STRING = 4
    STRING_WITH_ERROR = 5
    FORMULA = 6
    FORMULA_WITH_ERROR = 7

Document

class orcus.Document
    An instance of this class represents a document model. A document consists of multiple sheet objects.

    sheets
        Read-only attribute that stores a tuple of Sheet instance objects.

    get_named_expressions()
        Get a named expressions iterator.
        Return type NamedExpressions
        Returns named expression object.

FormatType

class orcus.FormatType(value)
    Collection of file format types currently used in orcus.

    UNKNOWN = 0
    ODS = 1
    XLSX = 2
    GNUMERIC = 3
    XLS_XML = 4
    CSV = 5
    YAML = 6
    JSON = 7
    XML = 8

FormulaToken

class orcus.FormulaToken
    This class represents a single formula token value as returned from a FormulaTokens iterator.

    op: orcus.FormulaTokenOp
        Attribute specifying the opcode of the formula token.

    type: orcus.FormulaTokenType
        Attribute specifying the type of the formula token.

FormulaTokenOp

class orcus.FormulaTokenOp(value)
    Collection of formula token opcodes.

    UNKNOWN = 0
    SINGLE_REF = 1
    RANGE_REF = 2
    TABLE_REF = 3
    NAMED_EXPRESSION = 4
    STRING = 5
    VALUE = 6
    FUNCTION = 7
    PLUS = 8
    MINUS = 9
    DIVIDE = 10
    MULTIPLY = 11
    EXPONENT = 12
    CONCAT = 13
    EQUAL = 14
    NOT_EQUAL = 15
    LESS = 16
    GREATER = 17
    LESS_EQUAL = 18
    GREATER_EQUAL = 19
    OPEN = 20
    CLOSE = 21
    SEP = 22
    ERROR = 23

FormulaTokenType

class orcus.FormulaTokenType(value)
    Collection of formula token types.

    UNKNOWN = 0
    REFERENCE = 1
    VALUE = 2
    NAME = 3
    FUNCTION = 4
    OPERATOR = 5
    ERROR = 6

FormulaTokens

class orcus.FormulaTokens
    Iterator for formula tokens within a Cell object representing a formula cell. Each iteration will return a FormulaToken object.

NamedExpressions

class NamedExpressions
    Iterator for named expressions.

    names: set
        A set of strings representing the names of the named expressions.

Sheet

class orcus.Sheet
    An instance of this class represents a single sheet inside a document.

    get_rows()
        This function returns a row iterator object that allows you to iterate through rows in the data region.
        Return type SheetRows
        Returns row iterator object.
        Example:

            rows = sheet.get_rows()

            for row in rows:
                print(row)  # tuple of cell values

    get_named_expressions()
        Get a named expressions iterator.
        Return type NamedExpressions
        Returns named expression object.

    write()
        Write sheet content to specified file object.
        Parameters
            • file – writable object to write the sheet content to.
            • format (FormatType) – format of the output. Note that it currently only supports a subset of the formats provided by the FormatType type.

    name
        Read-only attribute that stores the name of the sheet.

    sheet_size
        Read-only dictionary object that stores the column and row sizes of the sheet with the column and row keys, respectively.

    data_size
        Read-only dictionary object that stores the column and row sizes of the data region of the sheet with the column and row keys, respectively. The data region is the smallest possible range that includes all non-empty cells in the sheet. The top-left corner of the data region is always at the top-left corner of the sheet.
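The definition of the data region can be illustrated with a small stand-alone sketch. It is not orcus code: data_region_size is a hypothetical helper, the string keys "column" and "row" are an assumption based on the description of data_size, and the result for an empty sheet is also an assumption. Because the region is anchored at the top-left corner of the sheet, its size is determined by the largest row and column positions that hold a non-empty cell.

```python
def data_region_size(non_empty_cells):
    """Compute the size of the data region.

    non_empty_cells is an iterable of 0-based (row, column) positions.
    The region always starts at the top-left corner of the sheet, so only
    the largest row and column indices matter.
    """
    cells = list(non_empty_cells)
    if not cells:
        # Assumption: an empty sheet has a data region of size zero.
        return {"column": 0, "row": 0}
    return {
        "column": max(col for _, col in cells) + 1,
        "row": max(row for row, _ in cells) + 1,
    }


# Three non-empty cells; the farthest ones are in row 4 and column 5.
print(data_region_size([(0, 0), (2, 1), (4, 5)]))  # {'column': 6, 'row': 5}
```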

SheetRows

class SheetRows
    Iterator for rows within a Sheet object. Each iteration returns a tuple of Cell objects for the row.

3.1.2 orcus.tools

bugzilla

This command allows you to download attachments from a bugzilla server that supports REST API.

    usage: orcus.tools.bugzilla [-h] --outdir OUTDIR [--limit LIMIT] [--offset OFFSET] [--cont]
                                [--worker WORKER] [--cache-dir CACHE_DIR] --url URL
                                [query [query ...]]

Positional Arguments

query
    One or more query terms to use to limit your search. Each query term must be in the form key=value. You need to quote the value string when it contains whitespace characters, e.g. key="value with space".
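As an illustration of the key=value form, the sketch below turns a list of query terms into a dictionary of the kind that BugzillaAccess.get_bug_ids() is documented to accept. It is only a sketch: parse_query_terms is a hypothetical helper, not the function the command itself uses, and the sample keys and values are made up.

```python
def parse_query_terms(terms):
    """Convert ["key=value", ...] into a dict of search parameters."""
    params = {}
    for term in terms:
        # Split on the first "=" only, so that values may contain "=".
        key, sep, value = term.partition("=")
        if not sep or not key:
            raise ValueError(f"query term must be in the form key=value: {term!r}")
        params[key] = value
    return params


# The shell removes the quotes around a value containing whitespace, so the
# second term arrives as a single argument. The keys are hypothetical.
print(parse_query_terms(["product=Example", "summary=value with space"]))
# {'product': 'Example', 'summary': 'value with space'}
```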


Named Arguments

--outdir, -o
    output directory for downloaded files. Downloaded files are grouped by their respective bug ID’s.
--limit
    number of bugs to include in a single set of search results. Default: 50
--offset
    number of bugs to skip in the search results. Default: 0
--cont
    when specified, the search continues after the initial batch is returned, by retrieving the next batch of results until the entire search results are returned. The number specified by the --limit option is used as the batch size. Default: False
--worker
    number of worker threads to use for parallel downloads of files. Default: 8
--cache-dir
    directory to keep downloaded bugzilla search results. The command will not send the query request to the remote server when the results are cached. You may want to delete the cache directory after you are finished. Default: .bugzilla
--url
    base URL for bugzilla service. It must begin with the http(s):// prefix.

class orcus.tools.bugzilla.BugzillaAccess(bzurl, cache_dir)
    Encapsulates access to a bugzilla server by using its REST API.
    Parameters
        • bzurl (str) – URL to the bugzilla server.
        • cache_dir (pathlib.Path) – path to the cache directory.

    get_bug_ids(bz_params)
        Get all bug ID’s for specified bugzilla query parameters.
        Parameters bz_params (dict) – dictionary containing all search parameters. Each search term must form a single key-value pair.
        Returns (list of str): list of bug ID strings.

    get_attachments(bug_id)
        Fetch all attachments for specified bug.

file_processor

This script allows you to process a collection of spreadsheet documents.

    usage: orcus.tools.file_processor [-h] [--skip-file SKIP_FILE] [--processes PROCESSES]
                                      [-p PROCESSOR] [--remove-results] [--results] [--good]
                                      [--bad] [--stats] ROOT-DIR

Positional Arguments

ROOT-DIR Root directory below which to recursively find and process test files.

Named Arguments

--skip-file
    Optional text file containing a set of regular expressions (one per line). Files that match one of these rules will be skipped.
--processes
    Number of worker processes to use. Default: 1
-p, --processor
    Python module file containing callback functions.
--remove-results
    Remove all cached results files from the directory tree. Default: False
--results
    Display the results of the processed files. Default: False
--good
    Display the results of the successfully processed files. Default: False
--bad
    Display the results of the unsuccessfully processed files. Default: False
--stats
    Display statistics of the results. Use it with --results. Default: False
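The idea behind a skip file, one regular expression per line with a file skipped as soon as any rule matches, can be sketched as follows. This is an assumption-laden illustration rather than the script's actual logic: load_skip_rules and is_skipped are hypothetical names, and whether the real tool matches against the whole path or uses search-style matching is not stated in this document.

```python
import re


def load_skip_rules(lines):
    """Compile one regular expression per non-empty line."""
    return [re.compile(line.strip()) for line in lines if line.strip()]


def is_skipped(path, rules):
    """Return True if the path matches at least one rule.

    Assumption: a rule may match anywhere in the path (re.search).
    """
    return any(rule.search(path) for rule in rules)


# Hypothetical rules: skip text files and anything under a "private" folder.
rules = load_skip_rules([r"\.txt$", r"/private/", ""])
print(is_skipped("docs/private/a.ods", rules))  # True
print(is_skipped("docs/a.ods", rules))          # False
```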

3.1.3 orcus.csv

orcus.csv.read()
    Read a CSV file from a specified file path and create an orcus.Document instance object.
    Parameters stream – either string value, or file object containing a string stream.
    Return type orcus.Document
    Returns document instance object that stores the content of the file.
    Example:

        from orcus import csv

        with open("path/to/file.csv", "r") as f:
            doc = csv.read(f)

3.1.4 orcus.gnumeric

orcus.gnumeric.read()
    Read a Gnumeric file from a specified file path and create an orcus.Document instance object.
    Parameters
        • stream – file object containing byte streams.
        • recalc (bool) – optional parameter specifying whether or not to recalculate the formula cells on load. Defaults to False.
        • error_policy (str) – optional parameter indicating what to do when encountering formula cells with invalid formula expressions. The value must be either fail or skip. Defaults to fail.
    Return type orcus.Document
    Returns document instance object that stores the content of the file.
    Example:

        from orcus import gnumeric

        with open("path/to/file.gnumeric", "rb") as f:
            doc = gnumeric.read(f, recalc=True, error_policy="fail")

3.1.5 orcus.ods

orcus.ods.read()
    Read an Open Document Spreadsheet file from a specified file path and create an orcus.Document instance object.
    Parameters
        • stream – file object containing byte streams.
        • recalc (bool) – optional parameter specifying whether or not to recalculate the formula cells on load. Defaults to False.
        • error_policy (str) – optional parameter indicating what to do when encountering formula cells with invalid formula expressions. The value must be either fail or skip. Defaults to fail.
    Return type orcus.Document
    Returns document instance object that stores the content of the file.
    Example:

        from orcus import ods

        with open("path/to/file.ods", "rb") as f:
            doc = ods.read(f, recalc=True, error_policy="fail")

3.1.6 orcus.xlsx

orcus.xlsx.read()
    Read an Excel file from a specified file path and create an orcus.Document instance object. The file must be of Excel 2007 XML format.
    Parameters
        • stream – file object containing byte streams.
        • recalc (bool) – optional parameter specifying whether or not to recalculate the formula cells on load. Defaults to False.
        • error_policy (str) – optional parameter indicating what to do when encountering formula cells with invalid formula expressions. The value must be either fail or skip. Defaults to fail.
    Return type orcus.Document
    Returns document instance object that stores the content of the file.
    Example:

        from orcus import xlsx

        with open("path/to/file.xlsx", "rb") as f:
            doc = xlsx.read(f, recalc=True, error_policy="fail")

3.1.7 orcus.xls_xml

orcus.xls_xml.read()
    Read an Excel file from a specified file path and create an orcus.Document instance object. The file must be saved in the SpreadsheetML format.
    Parameters
        • stream – file object containing byte streams.
        • recalc (bool) – optional parameter specifying whether or not to recalculate the formula cells on load. Defaults to False.
        • error_policy (str) – optional parameter indicating what to do when encountering formula cells with invalid formula expressions. The value must be either fail or skip. Defaults to fail.
    Return type orcus.Document
    Returns document instance object that stores the content of the file.
    Example:

        from orcus import xls_xml

        with open("path/to/file.xls_xml", "rb") as f:
            doc = xls_xml.read(f, recalc=True, error_policy="fail")

CHAPTER FOUR

CLI

4.1 orcus-csv

4.1.1 Usage

    orcus-csv [options] FILE

The FILE must specify a path to an existing file.

4.1.2 Options

• -h [ --help ] Print this help.
• -d [ --debug ] Turn on a debug mode and optionally specify a debug level in order to generate run-time debug outputs.
• -r [ --recalc ] Re-calculate all formula cells after the document is loaded.
• -e [ --error-policy ] arg (=fail) Specify whether to abort immediately when the loader fails to parse the first formula cell (‘fail’), or skip the offending cells and continue (‘skip’).
• --dump-check Dump the content to stdout in a special format used for content verification in automated tests.
• -o [ --output ] arg Output directory path, or output file when the --dump-check option is used.
• -f [ --output-format ] arg Specify the output format. Supported format types are:
    – check - Flat format that fully encodes document content. Suitable for automated testing.
    – csv - CSV format.
    – flat - Flat text format that displays document content in a grid.
    – html - HTML format.
    – json - JSON format.
    – none - No output to be generated. May be useful during development.
    – xml - This format is currently unsupported.
• --row-size arg Specify the maximum number of rows in each sheet.
• --row-header arg Specify the number of header rows to repeat if the source content gets split into multiple sheets.
• --split Specify whether or not to split the data into multiple sheets in case it won’t fit in a single sheet.
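When orcus-csv is driven from a script, it can help to validate option values before launching the tool. The following is a minimal sketch under stated assumptions: build_orcus_csv_command is a hypothetical helper, data.csv and out are placeholder paths, and only options listed above are used. The command is built but not executed here.

```python
# Values taken from the option list above. "xml" is left out because it is
# documented as currently unsupported.
OUTPUT_FORMATS = ("check", "csv", "flat", "html", "json", "none")
ERROR_POLICIES = ("fail", "skip")


def build_orcus_csv_command(path, output, output_format="flat",
                            error_policy="fail", recalc=False):
    """Return the argument list for an orcus-csv invocation."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    if error_policy not in ERROR_POLICIES:
        raise ValueError(f"error policy must be 'fail' or 'skip': {error_policy!r}")
    cmd = ["orcus-csv", "-f", output_format, "-o", output, "-e", error_policy]
    if recalc:
        cmd.append("-r")
    cmd.append(path)
    return cmd


# To run it, pass the list to subprocess.run(cmd, check=True).
print(build_orcus_csv_command("data.csv", "out", output_format="json"))
# ['orcus-csv', '-f', 'json', '-o', 'out', '-e', 'fail', 'data.csv']
```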

4.2 orcus-gnumeric

4.2.1 Usage

    orcus-gnumeric [options] FILE

The FILE must specify a path to an existing file.

4.2.2 Options

• -h [ --help ] Print this help.
• -d [ --debug ] Turn on a debug mode and optionally specify a debug level in order to generate run-time debug outputs.
• -r [ --recalc ] Re-calculate all formula cells after the document is loaded.
• -e [ --error-policy ] arg (=fail) Specify whether to abort immediately when the loader fails to parse the first formula cell (‘fail’), or skip the offending cells and continue (‘skip’).
• --dump-check Dump the content to stdout in a special format used for content verification in automated tests.
• -o [ --output ] arg Output directory path, or output file when the --dump-check option is used.
• -f [ --output-format ] arg Specify the output format. Supported format types are:
    – check - Flat format that fully encodes document content. Suitable for automated testing.
    – csv - CSV format.
    – flat - Flat text format that displays document content in a grid.
    – html - HTML format.
    – json - JSON format.
    – none - No output to be generated. May be useful during development.
    – xml - This format is currently unsupported.
• --row-size arg Specify the maximum number of rows in each sheet.

4.3 orcus-json

4.3.1 Usage

    orcus-json [options] FILE

The FILE must specify the path to an existing file.

4.3.2 Options

• -h [ --help ] Print this help.
• --mode arg Mode of operation. Select one of the following options: convert, map, map-gen, or structure.
• --resolve-refs Resolve JSON references to external files.
• -o [ --output ] arg Output file path.
• -f [ --output-format ] arg Specify the format of the output file. Supported format types are:
    – XML (xml)
    – JSON (json)
    – flat tree dump (check)
    – no output (none)
• -m [ --map ] arg Path to a map file. This parameter is only used for map mode, and it is required for map mode.

4.4 orcus-ods

4.4.1 Usage

    orcus-ods [options] FILE

The FILE must specify a path to an existing file.

4.4.2 Options

• -h [ --help ] Print this help.
• -d [ --debug ] Turn on a debug mode and optionally specify a debug level in order to generate run-time debug outputs.
• -r [ --recalc ] Re-calculate all formula cells after the document is loaded.
• -e [ --error-policy ] arg (=fail) Specify whether to abort immediately when the loader fails to parse the first formula cell (‘fail’), or skip the offending cells and continue (‘skip’).
• --dump-check Dump the content to stdout in a special format used for content verification in automated tests.
• -o [ --output ] arg Output directory path, or output file when the --dump-check option is used.
• -f [ --output-format ] arg Specify the output format. Supported format types are:
    – check - Flat format that fully encodes document content. Suitable for automated testing.
    – csv - CSV format.
    – flat - Flat text format that displays document content in a grid.
    – html - HTML format.
    – json - JSON format.
    – none - No output to be generated. May be useful during development.
    – xml - This format is currently unsupported.
• --row-size arg Specify the maximum number of rows in each sheet.

4.5 orcus-xls-xml

4.5.1 Usage

    orcus-xls-xml [options] FILE

The FILE must specify a path to an existing file.

4.5.2 Options

• -h [ --help ] Print this help.
• -d [ --debug ] Turn on a debug mode and optionally specify a debug level in order to generate run-time debug outputs.
• -r [ --recalc ] Re-calculate all formula cells after the document is loaded.
• -e [ --error-policy ] arg (=fail) Specify whether to abort immediately when the loader fails to parse the first formula cell (‘fail’), or skip the offending cells and continue (‘skip’).
• --dump-check Dump the content to stdout in a special format used for content verification in automated tests.
• -o [ --output ] arg Output directory path, or output file when the --dump-check option is used.
• -f [ --output-format ] arg Specify the output format. Supported format types are:
    – check - Flat format that fully encodes document content. Suitable for automated testing.
    – csv - CSV format.
    – flat - Flat text format that displays document content in a grid.
    – html - HTML format.
    – json - JSON format.
    – none - No output to be generated. May be useful during development.
    – xml - This format is currently unsupported.
• --row-size arg Specify the maximum number of rows in each sheet.

4.6 orcus-xlsx

4.6.1 Usage

    orcus-xlsx [options] FILE

The FILE must specify a path to an existing file.

4.6.2 Options

• -h [ --help ] Print this help.
• -d [ --debug ] Turn on a debug mode and optionally specify a debug level in order to generate run-time debug outputs.
• -r [ --recalc ] Re-calculate all formula cells after the document is loaded.
• -e [ --error-policy ] arg (=fail) Specify whether to abort immediately when the loader fails to parse the first formula cell (‘fail’), or skip the offending cells and continue (‘skip’).
• --dump-check Dump the content to stdout in a special format used for content verification in automated tests.
• -o [ --output ] arg Output directory path, or output file when the --dump-check option is used.
• -f [ --output-format ] arg Specify the output format. Supported format types are:
    – check - Flat format that fully encodes document content. Suitable for automated testing.
    – csv - CSV format.
    – flat - Flat text format that displays document content in a grid.
    – html - HTML format.
    – json - JSON format.
    – none - No output to be generated. May be useful during development.
    – xml - This format is currently unsupported.
• --row-size arg Specify the maximum number of rows in each sheet.

4.7 orcus-xml

4.7.1 Usage

    orcus-xml [OPTIONS] FILE

4.7.2 Options

• -h [ --help ] Print this help.
• --mode arg Mode of operation. Select one of the following options: dump, map, map-gen, structure, or transform.
• -m [ --map ] arg Path to the map file. A map file is required for all modes except for the structure mode.
• -o [ --output ] arg Path to either an output directory, or an output file.
• -f [ --output-format ] arg Specify the output format. Supported format types are:
    – check - Flat format that fully encodes document content. Suitable for automated testing.
    – csv - CSV format.
    – flat - Flat text format that displays document content in a grid.
    – html - HTML format.
    – json - JSON format.
    – none - No output to be generated. May be useful during development.
    – xml - This format is currently unsupported.

4.8 orcus-yaml

4.8.1 Usage

    orcus-yaml [options] FILE

The FILE must specify a path to an existing file.


4.8.2 Options

• -h [ --help ] Print this help.
• -o [ --output ] arg Output file path.
• -f [ --output-format ] arg Specify the format of the output file. Supported format types are: 1) yaml 2) json

CHAPTER FIVE

NOTES

5.1 Mapping XML to spreadsheet

In this tutorial, we will go over how to use the orcus-xml command to map XML content into a spreadsheet document. We will be using this sample XML document throughout this tutorial.

5.1.1 Examining the structure of input XML document

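First, let’s examine the general structure of this XML document:

    <dataset>
      <record id="1">
        <name>
          <first>Tab</first>
          <last>Limpenny</last>
        </name>
        <active>true</active>
        <gender>Male</gender>
        <language>Kazakh</language>
      </record>
      <record id="2">
        <name>
          <first>Manda</first>
          <last>Hadgraft</last>
        </name>
        <active>false</active>
        <gender>Female</gender>
        <language>Bislama</language>
      </record>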

...

It starts with the <dataset> element as its root element, which contains recurring <record> elements each of which contains multiple fields. By looking at each <record> element structure, you can easily infer how the record content is structured. You can also run orcus-xml in structure mode in order to detect the structure of its content.

Running the following command

    orcus-xml --mode structure example.xml

should generate the following output:

    /dataset
    /dataset/record[*]
    /dataset/record[*]/@id
    /dataset/record[*]/name
    /dataset/record[*]/name/first
    /dataset/record[*]/name/last
    /dataset/record[*]/active
    /dataset/record[*]/gender
    /dataset/record[*]/language

This output lists the paths of all encountered “leaf node” items one item per line, in order of occurrence. Each path is expressed in an XPath-like format, except for recurring “anchor” elements which are suffixed with the [*] symbols. An anchor element in this context is defined as a recurring non-leaf element that contains either an attribute or a leaf element. You can think of anchor elements as elements that define the individual record boundaries.
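To make the role of the [*] suffix concrete, the sketch below post-processes the structure output shown above and picks out the anchor paths and the attribute paths. It only handles the text format as printed in this tutorial; anchor_paths and attribute_paths are hypothetical helper names and are not part of orcus.

```python
STRUCTURE_OUTPUT = """\
/dataset
/dataset/record[*]
/dataset/record[*]/@id
/dataset/record[*]/name
/dataset/record[*]/name/first
/dataset/record[*]/name/last
/dataset/record[*]/active
/dataset/record[*]/gender
/dataset/record[*]/language
"""


def anchor_paths(lines):
    """Return the paths whose last segment carries the [*] suffix."""
    return [line.strip() for line in lines if line.strip().endswith("[*]")]


def attribute_paths(lines):
    """Return the paths whose last segment is an attribute (@name)."""
    return [line.strip() for line in lines
            if line.strip().rsplit("/", 1)[-1].startswith("@")]


lines = STRUCTURE_OUTPUT.splitlines()
print(anchor_paths(lines))     # ['/dataset/record[*]']
print(attribute_paths(lines))  # ['/dataset/record[*]/@id']
```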

5.1.2 Auto-mapping the XML document

Mapping this XML document to a spreadsheet document can be done by simply running orcus-xml in map mode. You also need to specify the output format type and the output directory in order to see the content of the mapped spreadsheet document. Running the command:

    orcus-xml --mode map -f flat -o out example.xml

will create an output file named out/range-0.txt which contains the following:

    ---
    Sheet name: range-0
    rows: 21  cols: 6
    +--------+-----------+-------------+--------+--------+----------------+
    | id     | first     | last        | active | gender | language       |
    +--------+-----------+-------------+--------+--------+----------------+
    | 1 [v]  | Tab       | Limpenny    | true   | Male   | Kazakh         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 2 [v]  | Manda     | Hadgraft    | false  | Female | Bislama        |
    +--------+-----------+-------------+--------+--------+----------------+
    | 3 [v]  | Mickie    | Boreham     | false  | Male   | Swahili        |
    +--------+-----------+-------------+--------+--------+----------------+
    | 4 [v]  | Celinka   | Brookfield  | false  | Female | Gagauz         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 5 [v]  | Muffin    | Bleas       | false  | Female | Hiri Motu      |
    +--------+-----------+-------------+--------+--------+----------------+
    | 6 [v]  | Jackelyn  | Crumb       | false  | Female | Northern Sotho |
    +--------+-----------+-------------+--------+--------+----------------+
    | 7 [v]  | Tessie    | Hollingsbee | true   | Female | Fijian         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 8 [v]  | Yank      | Wernham     | false  | Male   | Tok Pisin      |
    +--------+-----------+-------------+--------+--------+----------------+
    | 9 [v]  | Brendan   | Lello       | true   | Male   | Fijian         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 10 [v] | Arabel    | Rigg        | false  | Female | Kyrgyz         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 11 [v] | Carolann  | McElory     | false  | Female | Pashto         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 12 [v] | Gasparo   | Flack       | false  | Male   | Telugu         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 13 [v] | Eolanda   | Polendine   | false  | Female | Kashmiri       |
    +--------+-----------+-------------+--------+--------+----------------+
    | 14 [v] | Brock     | McCaw       | false  | Male   | Tsonga         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 15 [v] | Wenda     | Espinas     | false  | Female | Bulgarian      |
    +--------+-----------+-------------+--------+--------+----------------+
    | 16 [v] | Zachary   | Banane      | true   | Male   | Persian        |
    +--------+-----------+-------------+--------+--------+----------------+
    | 17 [v] | Sallyanne | Mengue      | false  | Female | Latvian        |
    +--------+-----------+-------------+--------+--------+----------------+
    | 18 [v] | Elizabet  | Hoofe       | true   | Female | Tswana         |
    +--------+-----------+-------------+--------+--------+----------------+
    | 19 [v] | Alastair  | Hutchence   | true   | Male   | Ndebele        |
    +--------+-----------+-------------+--------+--------+----------------+
    | 20 [v] | Minor     | Worland     | true   | Male   | Dutch          |
    +--------+-----------+-------------+--------+--------+----------------+

We are using the flat format type which writes the data range of a sheet in a human-readable grid output. The mapped sheet content is the result of the automatic mapping of the original XML document. In automatic mapping, all attributes and element contents that can be mapped as field values will be mapped, and the sheet name will be automatically generated. Although not applicable to this particular example, if the source XML document contains multiple mappable ranges, they will get mapped to multiple sheets, one sheet per range.
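The same automatic mapping can be scripted. The sketch below wraps the command from this section in two hypothetical helpers: build_map_command assembles the arguments, and run_auto_map executes the tool and lists whatever files it wrote. It assumes orcus-xml is installed and on PATH when run_auto_map is called; only the command line is printed here, so the snippet runs without orcus.

```python
import shutil
import subprocess
from pathlib import Path


def build_map_command(xml_path, out_dir, output_format="flat"):
    """Build the orcus-xml command line for automatic mapping."""
    return ["orcus-xml", "--mode", "map", "-f", output_format,
            "-o", str(out_dir), str(xml_path)]


def run_auto_map(xml_path, out_dir):
    """Run the mapping and return the files found in out_dir, sorted by name."""
    cmd = build_map_command(xml_path, out_dir)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("orcus-xml was not found on PATH")
    subprocess.run(cmd, check=True)
    return sorted(Path(out_dir).iterdir())


print(" ".join(build_map_command("example.xml", "out")))
# orcus-xml --mode map -f flat -o out example.xml
```

With a source document that contains several mappable ranges, run_auto_map would be expected to return one file per range, matching the one-sheet-per-range behavior described above.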

5.1.3 Custom-mapping using map file

Generating map file

Automatic mapping should work reasonably well in many cases, but sometimes you may need to customize how you map your data, and this section goes over how to do just that. The short answer is that you need to create a map definition file and pass it to the orcus-xml command via the -m or --map option. The easiest way to go about it is to have one generated for you. Running the following command:

    orcus-xml --mode map-gen -o map.xml example.xml

will generate a map file map.xml which contains the mapping definition based on the auto-detected structure. The content of map.xml generated from the example XML document should look like this:

Note that since the original map file content does not include any line breaks, you may want to run it through an XML reformatting tool such as xmllint to “prettify” its content before viewing.

Map file structure

Hopefully the structure of the map file is self-explanatory, but let us go over it a little. The map element is the root element which contains one or more sheet elements and one or more range elements.

The sheet elements specify how many sheets should be created in the spreadsheet model, and what their names should be via their name attributes. The ordering of the sheet elements will reflect the ordering of the sheets in the final spreadsheet document.

Each range element defines one mapped range of the source XML document, and this element itself stores the top-left position of the range in the final spreadsheet document via sheet, row and column attributes. The range element then contains one or more field elements, and one or more row-group elements.

Each field element defines one field within the mapped range and the path of the value in the source XML document. The path is expressed in XPath format. The ordering of the field elements reflects the ordering of the field columns in the final spreadsheet document.

Each row-group element defines the path of an anchor element. For a simple XML document such as our current example, you only need one row-group element. But an XML document with a more complex structure may need more than one row-group element to properly map nested recurring elements.
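To make the element hierarchy described above easier to picture, here is a minimal Python sketch that parses a small, hypothetical map file and summarizes it. The element names and the name, sheet, row and column attributes come from the description above. The path attribute name on the field and row-group elements is an assumption borrowed from the JSON map format shown later in this chapter, and the XPath values are made up for illustration. A map file generated by orcus-xml may carry additional details, such as an XML namespace, that this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Hypothetical map file. Only the element names and the name/sheet/row/column
# attributes are taken from the text; the "path" attribute name and the XPath
# values are assumptions made for illustration.
MAP_XML = """\
<map>
  <sheet name="range-0"/>
  <range sheet="range-0" row="0" column="0">
    <field path="/dataset/record/@id"/>
    <field path="/dataset/record/name"/>
    <row-group path="/dataset/record"/>
  </range>
</map>
"""


def summarize_map(xml_text):
    """Return the sheet names and a summary of each mapped range."""
    root = ET.fromstring(xml_text)
    # Sheet order in the map file is the sheet order in the output document.
    sheets = [sheet.get("name") for sheet in root.findall("sheet")]
    ranges = []
    for rng in root.findall("range"):
        ranges.append({
            "sheet": rng.get("sheet"),
            # Top-left cell of the range in the destination sheet.
            "top_left": (int(rng.get("row")), int(rng.get("column"))),
            # Field order is the column order in the output document.
            "fields": [f.get("path") for f in rng.findall("field")],
            "row_groups": [g.get("path") for g in rng.findall("row-group")],
        })
    return sheets, ranges


if __name__ == "__main__":
    print(summarize_map(MAP_XML))
```

The summary keeps the two orderings the text calls out: sheets appear in map-file order, and each range lists its fields in column order.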

Modifying map file

Let’s make some changes to this map file. First, the default sheet name range-0 doesn’t look very good, so we’ll change it to My Data. Also, let’s assume we aren’t really interested in the ID values or the “active” values (whatever they may mean), so we’ll drop those two fields. Additionally, since we don’t like the default field labels, which are taken literally from the names of the corresponding attributes or elements, we’ll define custom field labels. And finally, we’ll add two empty rows above the data range so that we can edit in some nice title afterward. The modified map file will look like this:


We’ll save this as map-modified.xml, and pass it to the orcus-xml command this time around like so:

./src/orcus-xml --mode map -m map-modified.xml -o out -f flat example.xml

This will output the content of the sheet to out/My Data.txt, which will look like this:

---
Sheet name: My Data
rows: 23 cols: 4
+------+------+------+------+
| | | | |
+------+------+------+------+
| | | | |
+------+------+------+------+
| First Name| Last Name| Gender| Language|
+------+------+------+------+
| Tab| Limpenny| Male| Kazakh|
+------+------+------+------+
| Manda| Hadgraft| Female| Bislama|
+------+------+------+------+
| Mickie| Boreham| Male| Swahili|
+------+------+------+------+
| Celinka| Brookfield| Female| Gagauz|
+------+------+------+------+
| Muffin| Bleas| Female| Hiri Motu|
+------+------+------+------+
| Jackelyn| Crumb| Female| Northern Sotho|
+------+------+------+------+
| Tessie| Hollingsbee| Female| Fijian|
+------+------+------+------+
| Yank| Wernham| Male| Tok Pisin|
+------+------+------+------+
| Brendan| Lello| Male| Fijian|
+------+------+------+------+
| Arabel| Rigg| Female| Kyrgyz|
+------+------+------+------+
| Carolann| McElory| Female| Pashto|
+------+------+------+------+
| Gasparo| Flack| Male| Telugu|
+------+------+------+------+
| Eolanda| Polendine| Female| Kashmiri|
+------+------+------+------+
| Brock| McCaw| Male| Tsonga|
+------+------+------+------+
| Wenda| Espinas| Female| Bulgarian|
+------+------+------+------+
| Zachary| Banane| Male| Persian|
+------+------+------+------+
| Sallyanne| Mengue| Female| Latvian|
+------+------+------+------+
| Elizabet| Hoofe| Female| Tswana|
+------+------+------+------+
| Alastair| Hutchence| Male| Ndebele|
+------+------+------+------+
| Minor| Worland| Male| Dutch|
+------+------+------+------+

The new output now contains only four fields, with custom labels at the top, and it has two empty rows above the data, just as we intended.

5.2 Mapping JSON to spreadsheet

This tutorial covers how to map a JSON document to a spreadsheet document, very similar to what we covered in the XML mapping tutorial, where we illustrated how to map an XML document to a spreadsheet document. Throughout this tutorial, we will be using a sample JSON document to illustrate how to achieve it using the orcus-json command. The structure of this tutorial will be similar to the structure of the XML mapping counterpart, since the steps are very similar.

5.2.1 Examining the structure of the input JSON document

Let’s first take a look at the sample JSON document:

[
    {
        "id": 1,
        "name": [
            "Tab",
            "Limpenny"
        ],
        "active": true,
        "gender": "Male",
        "language": "Kazakh"
    },
    {
        "id": 2,
        "name": [
            "Manda",
            "Hadgraft"
        ],
        "active": false,
        "gender": "Female",
        "language": "Bislama"
    },
    {
        "id": 3,
        "name": [
            "Mickie",
            "Boreham"
        ],
        "active": false,
        "gender": "Male",
        "language": "Swahili"
    },

    ...

This is essentially the same content as the XML sample document we used in the last tutorial but re-formatted in JSON. Let’s run the following command:

orcus-json --mode structure example.json

to analyze the structure of this JSON document. The command will generate the following output:

$array[20].object(*)['active'].value
$array[20].object(*)['gender'].value
$array[20].object(*)['id'].value
$array[20].object(*)['language'].value
$array[20].object(*)['name'].array[2].value[0,1]

This structure output resembles a variant of JSONPath with some modifications applied. It has the following characteristics:
• The $ symbol represents the root of the structure.
• An array node takes the form of array[N], where the value of N represents the number of elements.
• An object node takes the form of object['key'].
• A value node, which is always a leaf node, is represented by value, except when the leaf node is an array containing values, in which case it takes the form of value[0,1,2,...].
• The . symbols represent the node boundaries.
• The (*) symbols represent recurring nodes, which can be either array or object.
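The notation above can be reproduced for simple documents with a few lines of Python. The following is a minimal sketch, not the algorithm orcus-json actually uses: the function name structure_paths is hypothetical, the (*) marker is only attached to objects that sit directly inside an array, and an array is treated as a leaf only when none of its elements is an object or an array.

```python
def structure_paths(node, prefix="$", recurring=False):
    """Return the set of structure lines for a parsed JSON value.

    Simplified approximation of the notation described above; it is not
    the implementation used by orcus-json.
    """
    if isinstance(node, dict):
        mark = "(*)" if recurring else ""
        lines = set()
        for key, child in node.items():
            # An object node is written as object['key'].
            lines |= structure_paths(child, f"{prefix}object{mark}['{key}'].")
        return lines
    if isinstance(node, list):
        head = f"{prefix}array[{len(node)}]"
        if all(not isinstance(child, (dict, list)) for child in node):
            # Leaf array: list the positions of the values it contains.
            positions = ",".join(str(i) for i in range(len(node)))
            return {f"{head}.value[{positions}]"}
        lines = set()
        for child in node:
            # Elements of the array recur, so their lines are merged.
            lines |= structure_paths(child, head + ".", recurring=True)
        return lines
    # Any other value (string, number, boolean, null) is a leaf node.
    return {f"{prefix}value"}


if __name__ == "__main__":
    sample = [
        {"id": 1, "name": ["Tab", "Limpenny"], "active": True},
        {"id": 2, "name": ["Manda", "Hadgraft"], "active": False},
    ]
    for line in sorted(structure_paths(sample)):
        print(line)
```

Because the lines are collected in a set, the twenty near-identical records of the sample document would collapse into one line per distinct path, which is why the real output has only five lines.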

5.2.2 Auto-mapping the JSON document

Let’s map this JSON document to a spreadsheet document by running:

orcus-json --mode map -o out -f flat example.json

This is very similar to what we did in the XML mapping tutorial, except that the command used is orcus-json and the input file is example.json. This will create a file named out/range-0.txt which contains the following:

---
Sheet name: range-0
rows: 21 cols: 6
+------+------+------+------+------+------+
| id| field0| field1| active| gender| language|
+------+------+------+------+------+------+
| 1 [v]| Tab| Limpenny| true [b]| Male| Kazakh|
+------+------+------+------+------+------+
| 2 [v]| Manda| Hadgraft| false [b]| Female| Bislama|
+------+------+------+------+------+------+
| 3 [v]| Mickie| Boreham| false [b]| Male| Swahili|
+------+------+------+------+------+------+
| 4 [v]| Celinka| Brookfield| false [b]| Female| Gagauz|
+------+------+------+------+------+------+
| 5 [v]| Muffin| Bleas| false [b]| Female| Hiri Motu|
+------+------+------+------+------+------+
| 6 [v]| Jackelyn| Crumb| false [b]| Female| Northern Sotho|
+------+------+------+------+------+------+
| 7 [v]| Tessie| Hollingsbee| true [b]| Female| Fijian|
+------+------+------+------+------+------+
| 8 [v]| Yank| Wernham| false [b]| Male| Tok Pisin|
+------+------+------+------+------+------+
| 9 [v]| Brendan| Lello| true [b]| Male| Fijian|
+------+------+------+------+------+------+
| 10 [v]| Arabel| Rigg| false [b]| Female| Kyrgyz|
+------+------+------+------+------+------+
| 11 [v]| Carolann| McElory| false [b]| Female| Pashto|
+------+------+------+------+------+------+
| 12 [v]| Gasparo| Flack| false [b]| Male| Telugu|
+------+------+------+------+------+------+
| 13 [v]| Eolanda| Polendine| false [b]| Female| Kashmiri|
+------+------+------+------+------+------+
| 14 [v]| Brock| McCaw| false [b]| Male| Tsonga|
+------+------+------+------+------+------+
| 15 [v]| Wenda| Espinas| false [b]| Female| Bulgarian|
+------+------+------+------+------+------+
| 16 [v]| Zachary| Banane| true [b]| Male| Persian|
+------+------+------+------+------+------+
| 17 [v]| Sallyanne| Mengue| false [b]| Female| Latvian|
+------+------+------+------+------+------+
| 18 [v]| Elizabet| Hoofe| true [b]| Female| Tswana|
+------+------+------+------+------+------+
| 19 [v]| Alastair| Hutchence| true [b]| Male| Ndebele|
+------+------+------+------+------+------+
| 20 [v]| Minor| Worland| true [b]| Male| Dutch|
+------+------+------+------+------+------+

Again, this is very similar to what we saw in the XML-mapping example. Note that cell values with [v] and [b] indicate numeric and boolean values, respectively. Cells with no suffixes are string cells.

5.2.3 Custom-mapping using map file

This process is also very similar to the process we followed for XML mapping. We first auto-generate a map file, modify it, and use it to do the mapping again. Since there isn’t much difference between XML mapping and JSON mapping, let’s just go through this very quickly. The first step is to generate a map file for the auto-detected range by running:

orcus-json --mode map-gen -o map.json example.json


which will write the mapping rules to the map.json file. When you open the generated map file, you will see something like the following:

{
    "sheets": [
        "range-0"
    ],
    "ranges": [
        {
            "sheet": "range-0",
            "row": 0,
            "column": 0,
            "row-header": true,
            "fields": [
                {
                    "path": "$[]['id']"
                },
                {
                    "path": "$[]['name'][0]"
                },
                {
                    "path": "$[]['name'][1]"
                },
                {
                    "path": "$[]['active']"
                },
                {
                    "path": "$[]['gender']"
                },
                {
                    "path": "$[]['language']"
                }
            ],
            "row-groups": [
                {
                    "path": "$"
                }
            ]
        }
    ]
}

The structure and content of the map file should look similar to the XML counterpart, except that it is now in JSON format, and the paths are expressed in a slightly modified JSONPath bracket notation, where [] represents an array node with no position specified.

Now that we have a map file, let’s modify it and use it to do the mapping once again. Just like in the XML mapping example, we are going to:
• insert two blank rows above,
• drop the id and active fields,
• specify labels for the fields, and
• change the sheet name from range-0 to My Data.


This is what we’ve come up with:

{
    "sheets": [
        "My Data"
    ],
    "ranges": [
        {
            "sheet": "My Data",
            "row": 2,
            "column": 0,
            "row-header": true,
            "fields": [
                {
                    "path": "$[]['name'][0]",
                    "label": "First Name"
                },
                {
                    "path": "$[]['name'][1]",
                    "label": "Last Name"
                },
                {
                    "path": "$[]['gender']",
                    "label": "Gender"
                },
                {
                    "path": "$[]['language']",
                    "label": "Language"
                }
            ],
            "row-groups": [
                {
                    "path": "$"
                }
            ]
        }
    ]
}
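If you would rather not edit the generated file by hand, the same four changes can be applied with a short script. The following Python sketch works on the parsed map file and assumes it contains a single range, as in this example; the function name customize_map and the LABELS table are hypothetical names introduced for illustration.

```python
import copy

# Paths to keep, in column order, with the labels we want to assign.
LABELS = {
    "$[]['name'][0]": "First Name",
    "$[]['name'][1]": "Last Name",
    "$[]['gender']": "Gender",
    "$[]['language']": "Language",
}


def customize_map(generated, sheet_name="My Data", blank_rows=2):
    """Return a modified copy of an auto-generated map with one range."""
    modified = copy.deepcopy(generated)
    # Rename the sheet, both in the sheet list and in the range itself.
    modified["sheets"] = [sheet_name]
    rng = modified["ranges"][0]
    rng["sheet"] = sheet_name
    # Move the range down to leave blank rows above it.
    rng["row"] = blank_rows
    # Drop fields we are not interested in and label the remaining ones.
    rng["fields"] = [
        {"path": field["path"], "label": LABELS[field["path"]]}
        for field in rng["fields"]
        if field["path"] in LABELS
    ]
    return modified
```

To use it on the files from this tutorial, load map.json with json.load(), pass the result to customize_map(), and write the returned dictionary to map-modified.json with json.dump().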

We’ll save this file as map-modified.json, and pass it to the orcus-json command via the --map or -m option:

orcus-json --mode map -o out -f flat -m map-modified.json example.json

Let’s check the output in out/My Data.txt and see what it contains:

---
Sheet name: My Data
rows: 23 cols: 4
+------+------+------+------+
| | | | |
+------+------+------+------+
| | | | |
+------+------+------+------+
| First Name| Last Name| Gender| Language|
+------+------+------+------+
| Tab| Limpenny| Male| Kazakh|
+------+------+------+------+
| Manda| Hadgraft| Female| Bislama|
+------+------+------+------+
| Mickie| Boreham| Male| Swahili|
+------+------+------+------+
| Celinka| Brookfield| Female| Gagauz|
+------+------+------+------+
| Muffin| Bleas| Female| Hiri Motu|
+------+------+------+------+
| Jackelyn| Crumb| Female| Northern Sotho|
+------+------+------+------+
| Tessie| Hollingsbee| Female| Fijian|
+------+------+------+------+
| Yank| Wernham| Male| Tok Pisin|
+------+------+------+------+
| Brendan| Lello| Male| Fijian|
+------+------+------+------+
| Arabel| Rigg| Female| Kyrgyz|
+------+------+------+------+
| Carolann| McElory| Female| Pashto|
+------+------+------+------+
| Gasparo| Flack| Male| Telugu|
+------+------+------+------+
| Eolanda| Polendine| Female| Kashmiri|
+------+------+------+------+
| Brock| McCaw| Male| Tsonga|
+------+------+------+------+
| Wenda| Espinas| Female| Bulgarian|
+------+------+------+------+
| Zachary| Banane| Male| Persian|
+------+------+------+------+
| Sallyanne| Mengue| Female| Latvian|
+------+------+------+------+
| Elizabet| Hoofe| Female| Tswana|
+------+------+------+------+
| Alastair| Hutchence| Male| Ndebele|
+------+------+------+------+
| Minor| Worland| Male| Dutch|
+------+------+------+------+

The id and active fields are gone, the remaining fields have the custom labels we specified, and there are two blank rows above. All the changes we intended have been properly applied.
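To see why the output has this shape, it can help to apply a map by hand. The following Python sketch resolves each field path against every element of the root array and lays the values out at the row and column offsets given in the range. It is a simplified illustration and not how orcus-json is implemented: it only handles paths that start with $[], it ignores the row-groups entry, and the names resolve and apply_range are hypothetical. Falling back to the path when a field has no label is also an assumption; orcus-json generates its own default labels.

```python
import re

# One path step after the leading "$[]": either ['key'] or [index].
STEP = re.compile(r"\['([^']*)'\]|\[(\d+)\]")

MODIFIED_RANGE = {
    "sheet": "My Data", "row": 2, "column": 0, "row-header": True,
    "fields": [
        {"path": "$[]['name'][0]", "label": "First Name"},
        {"path": "$[]['name'][1]", "label": "Last Name"},
        {"path": "$[]['gender']", "label": "Gender"},
        {"path": "$[]['language']", "label": "Language"},
    ],
}


def resolve(path, record):
    """Follow a path such as $[]['name'][0] inside one array element."""
    if not path.startswith("$[]"):
        raise ValueError("only paths under the root array are supported")
    node = record
    for step in STEP.finditer(path[3:]):
        key, index = step.group(1), step.group(2)
        node = node[key] if key is not None else node[int(index)]
    return node


def apply_range(rng, records):
    """Build the sheet content for one range as a list of rows."""
    fields = rng["fields"]
    pad = [""] * rng["column"]            # empty cells left of the range
    width = rng["column"] + len(fields)
    grid = [[""] * width for _ in range(rng["row"])]   # blank rows above
    if rng.get("row-header"):
        # Assumption: use the path itself when no label is given.
        grid.append(pad + [f.get("label", f["path"]) for f in fields])
    for record in records:
        grid.append(pad + [resolve(f["path"], record) for f in fields])
    return grid


if __name__ == "__main__":
    records = [
        {"id": 1, "name": ["Tab", "Limpenny"], "active": True,
         "gender": "Male", "language": "Kazakh"},
    ]
    for row in apply_range(MODIFIED_RANGE, records):
        print(row)
```

With the 20 records of the sample document, this layout yields 2 blank rows, 1 header row and 20 data rows across 4 columns, which agrees with the rows: 23 and cols: 4 values reported in the output above.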

CHAPTER SIX

INDICES AND TABLES

• genindex
• search

INDEX

B FUNCTION (orcus.FormulaTokenType attribute), 120 BOOLEAN (orcus.CellType attribute), 118 BugzillaAccess (class in orcus.tools.bugzilla), 122 G built-in function get_attachments() (or- orcus.csv.read(), 123 cus.tools.bugzilla.BugzillaAccess method), orcus.detect_format(), 117 122 orcus.Document.get_named_expressions(), get_bug_ids() (orcus.tools.bugzilla.BugzillaAccess 118 method), 122 orcus.gnumeric.read(), 124 get_formula_tokens() (orcus.Cell method), 117 orcus.ods.read(), 124 GNUMERIC (orcus.FormatType attribute), 118 orcus.Sheet.get_named_expressions(), 120 GREATER (orcus.FormulaTokenOp attribute), 119 orcus.Sheet.get_rows(), 120 GREATER_EQUAL (orcus.FormulaTokenOp attribute), 119 orcus.Sheet.write(), 120 orcus.xls_xml.read(), 125 J orcus.xlsx.read(), 125 JSON (orcus.FormatType attribute), 118 C L CellType (class in orcus), 118 LESS (orcus.FormulaTokenOp attribute), 119 CLOSE (orcus.FormulaTokenOp attribute), 119 LESS_EQUAL (orcus.FormulaTokenOp attribute), 119 CONCAT (orcus.FormulaTokenOp attribute), 119 CSV (orcus.FormatType attribute), 118 M MINUS (orcus.FormulaTokenOp attribute), 119 D MULTIPLY (orcus.FormulaTokenOp attribute), 119 data_size (orcus.Sheet attribute), 121 DIVIDE (orcus.FormulaTokenOp attribute), 119 N NAME (orcus.FormulaTokenType attribute), 120 E name (orcus.Sheet attribute), 121 EMPTY (orcus.CellType attribute), 118 NAMED_EXPRESSION (orcus.FormulaTokenOp attribute), EQUAL (orcus.FormulaTokenOp attribute), 119 119 ERROR (orcus.FormulaTokenOp attribute), 119 NamedExpressions (built-in class), 120 ERROR (orcus.FormulaTokenType attribute), 120 names (NamedExpressions attribute), 120 EXPONENT (orcus.FormulaTokenOp attribute), 119 NOT_EQUAL (orcus.FormulaTokenOp attribute), 119 NUMERIC (orcus.CellType attribute), 118 F FormatType (class in orcus), 118 O formula (orcus.Cell attribute), 117 ODS (orcus.FormatType attribute), 118 FORMULA (orcus.CellType attribute), 118 op (orcus.FormulaToken attribute), 119 FORMULA_WITH_ERROR (orcus.CellType 
attribute), 118 OPEN (orcus.FormulaTokenOp attribute), 119 FormulaTokenOp (class in orcus), 119 OPERATOR (orcus.FormulaTokenType attribute), 120 FormulaTokenType (class in orcus), 120 orcus.Cell (built-in class), 117 FUNCTION (orcus.FormulaTokenOp attribute), 119 orcus.csv.read()

149 Orcus Documentation, Release 0.16

built-in function, 123 orcus::css::property_value_t::hsla (C++ enu- orcus.detect_format() merator), 46 built-in function, 117 orcus::css::property_value_t::none (C++ enu- orcus.Document (built-in class), 118 merator), 46 orcus.Document.get_named_expressions() orcus::css::property_value_t::rgb (C++ enu- built-in function, 118 merator), 46 orcus.FormulaToken (built-in class), 119 orcus::css::property_value_t::rgba (C++ enu- orcus.FormulaTokens (built-in class), 120 merator), 46 orcus.gnumeric.read() orcus::css::property_value_t::string (C++ built-in function, 124 enumerator), 46 orcus.ods.read() orcus::css::property_value_t::url (C++ enu- built-in function, 124 merator), 46 orcus.Sheet (built-in class), 120 orcus::css::pseudo_class_t (C++ type), 47 orcus.Sheet.get_named_expressions() orcus::css::pseudo_element_t (C++ type), 46 built-in function, 120 orcus::css_handler (C++ class), 44 orcus.Sheet.get_rows() orcus::css_handler::at_rule_name (C++ func- built-in function, 120 tion), 44 orcus.Sheet.write() orcus::css_handler::begin_block (C++ function), built-in function, 120 45 orcus.xls_xml.read() orcus::css_handler::begin_parse (C++ function), built-in function, 125 45 orcus.xlsx.read() orcus::css_handler::begin_property (C++ func- built-in function, 125 tion), 46 orcus::cell_buffer (C++ class), 39 orcus::css_handler::combinator (C++ function), orcus::cell_buffer::append (C++ function), 40 44 orcus::cell_buffer::cell_buffer (C++ function), orcus::css_handler::end_block (C++ function), 46 40 orcus::css_handler::end_parse (C++ function), 45 orcus::cell_buffer::empty (C++ function), 40 orcus::css_handler::end_property (C++ func- orcus::cell_buffer::get (C++ function), 40 tion), 46 orcus::cell_buffer::reset (C++ function), 40 orcus::css_handler::end_selector (C++ func- orcus::cell_buffer::size (C++ function), 40 tion), 44 orcus::css::combinator_t (C++ enum), 46 orcus::css_handler::end_simple_selector orcus::css::combinator_t::descendant (C++ (C++ function), 44 enumerator), 
46 orcus::css_handler::hsl (C++ function), 45 orcus::css::combinator_t::direct_child (C++ orcus::css_handler::hsla (C++ function), 45 enumerator), 46 orcus::css_handler::property_name (C++ func- orcus::css::combinator_t::next_sibling (C++ tion), 44 enumerator), 46 orcus::css_handler::rgb (C++ function), 45 orcus::css::property_function_t (C++ enum), 46 orcus::css_handler::rgba (C++ function), 45 orcus::css::property_function_t::hsl (C++ orcus::css_handler::simple_selector_class enumerator), 46 (C++ function), 44 orcus::css::property_function_t::hsla (C++ orcus::css_handler::simple_selector_id (C++ enumerator), 46 function), 44 orcus::css::property_function_t::rgb (C++ orcus::css_handler::simple_selector_pseudo_class enumerator), 46 (C++ function), 44 orcus::css::property_function_t::rgba (C++ orcus::css_handler::simple_selector_pseudo_element enumerator), 46 (C++ function), 44 orcus::css::property_function_t::unknown orcus::css_handler::simple_selector_type (C++ enumerator), 46 (C++ function), 44 orcus::css::property_function_t::url (C++ orcus::css_handler::url (C++ function), 45 enumerator), 46 orcus::css_handler::value (C++ function), 44 orcus::css::property_value_t (C++ enum), 46 orcus::css_parser (C++ class), 43 orcus::css::property_value_t::hsl (C++ enu- orcus::css_parser::css_parser (C++ function), 44 merator), 46 orcus::css_parser::handler_type (C++ type), 44

150 Index Orcus Documentation, Release 0.16

orcus::css_parser::parse (C++ function), 44 (C++ function), 58 orcus::csv::parser_config (C++ struct), 47 orcus::iface::import_filter::read_file (C++ orcus::csv::parser_config::delimiters (C++ function), 58 member), 47 orcus::iface::import_filter::read_stream orcus::csv::parser_config::parser_config (C++ function), 58 (C++ function), 47 orcus::iface::import_filter::set_config orcus::csv::parser_config::text_qualifier (C++ function), 58 (C++ member), 47 orcus::import_ods (C++ class), 91 orcus::csv::parser_config::trim_cell_value orcus::import_ods::read_styles (C++ function), (C++ member), 47 91 orcus::csv_handler (C++ class), 48 orcus::import_xlsx (C++ class), 92 orcus::csv_handler::begin_parse (C++ function), orcus::import_xlsx::read_table (C++ function), 48 93 orcus::csv_handler::begin_row (C++ function), 48 orcus::json::array (C++ class), 113 orcus::csv_handler::cell (C++ function), 48 orcus::json::array::~array (C++ function), 113 orcus::csv_handler::end_parse (C++ function), 48 orcus::json::array::array (C++ function), 113 orcus::csv_handler::end_row (C++ function), 48 orcus::json::const_node (C++ class), 110 orcus::csv_parser (C++ class), 47 orcus::json::const_node::~const_node (C++ orcus::csv_parser::csv_parser (C++ function), 47 function), 110 orcus::csv_parser::handler_type (C++ type), 47 orcus::json::const_node::back (C++ function), orcus::csv_parser::parse (C++ function), 47 111 orcus::date_time_t (C++ struct), 42 orcus::json::const_node::begin (C++ function), orcus::date_time_t::~date_time_t (C++ func- 112 tion), 43 orcus::json::const_node::child (C++ function), orcus::date_time_t::date_time_t (C++ function), 111 43 orcus::json::const_node::child_count (C++ orcus::date_time_t::day (C++ member), 43 function), 110 orcus::date_time_t::hour (C++ member), 43 orcus::json::const_node::const_node (C++ orcus::date_time_t::minute (C++ member), 43 function), 110 orcus::date_time_t::month (C++ member), 43 orcus::json::const_node::end (C++ function), 112 
orcus::date_time_t::operator!= (C++ function), orcus::json::const_node::has_key (C++ func- 43 tion), 111 orcus::date_time_t::operator= (C++ function), 43 orcus::json::const_node::identity (C++ func- orcus::date_time_t::operator== (C++ function), tion), 112 43 orcus::json::const_node::key (C++ function), 110 orcus::date_time_t::second (C++ member), 43 orcus::json::const_node::keys (C++ function), orcus::date_time_t::swap (C++ function), 43 110 orcus::date_time_t::to_string (C++ function), 43 orcus::json::const_node::numeric_value (C++ orcus::date_time_t::year (C++ member), 43 function), 111 orcus::iface::document_dumper (C++ class), 58 orcus::json::const_node::operator= (C++ func- orcus::iface::document_dumper::~document_dumper tion), 112 (C++ function), 59 orcus::json::const_node::parent (C++ function), orcus::iface::document_dumper::dump (C++ 111 function), 59 orcus::json::const_node::string_value (C++ orcus::iface::document_dumper::dump_check function), 111 (C++ function), 59 orcus::json::const_node::type (C++ function), orcus::iface::import_filter (C++ class), 58 110 orcus::iface::import_filter::~import_filter orcus::json::detail::init::node (C++ class), (C++ function), 58 114 orcus::iface::import_filter::get_config orcus::json::detail::init::node::~node (C++ (C++ function), 58 function), 114 orcus::iface::import_filter::get_name (C++ orcus::json::detail::init::node::node (C++ function), 58 function), 114 orcus::iface::import_filter::import_filter orcus::json::detail::init::node::operator=

Index 151 Orcus Documentation, Release 0.16

(C++ function), 114 orcus::json::object (C++ class), 113 orcus::json::document_error (C++ class), 115 orcus::json::object::~object (C++ function), 114 orcus::json::document_error::~document_error orcus::json::object::object (C++ function), 114 (C++ function), 115 orcus::json_config (C++ struct), 109 orcus::json::document_error::document_error orcus::json_config::~json_config (C++ func- (C++ function), 115 tion), 109 orcus::json::document_tree (C++ class), 108 orcus::json_config::input_path (C++ member), orcus::json::document_tree::~document_tree 109 (C++ function), 108 orcus::json_config::json_config (C++ function), orcus::json::document_tree::document_tree 109 (C++ function), 108 orcus::json_config::output_format (C++ mem- orcus::json::document_tree::dump (C++ func- ber), 109 tion), 109 orcus::json_config::output_path (C++ member), orcus::json::document_tree::dump_xml (C++ 109 function), 109 orcus::json_config::persistent_string_values orcus::json::document_tree::get_document_root (C++ member), 110 (C++ function), 109 orcus::json_config::preserve_object_order orcus::json::document_tree::load (C++ func- (C++ member), 109 tion), 108, 109 orcus::json_config::resolve_references (C++ orcus::json::document_tree::operator= (C++ member), 110 function), 108 orcus::json_handler (C++ class), 49 orcus::json::document_tree::swap (C++ func- orcus::json_handler::begin_array (C++ func- tion), 109 tion), 49 orcus::json::key_value_error (C++ class), 115 orcus::json_handler::begin_object (C++ func- orcus::json::key_value_error::~key_value_error tion), 49 (C++ function), 116 orcus::json_handler::begin_parse (C++ func- orcus::json::key_value_error::key_value_error tion), 49 (C++ function), 116 orcus::json_handler::boolean_false (C++ func- orcus::json::node (C++ class), 112 tion), 49 orcus::json::node::~node (C++ function), 112 orcus::json_handler::boolean_true (C++ func- orcus::json::node::back (C++ function), 113 tion), 49 orcus::json::node::child (C++ function), 112 
orcus::json_handler::end_array (C++ function), orcus::json::node::node (C++ function), 112 49 orcus::json::node::operator= (C++ function), 112 orcus::json_handler::end_object (C++ function), orcus::json::node::operator[] (C++ function), 49 112 orcus::json_handler::end_parse (C++ function), orcus::json::node::parent (C++ function), 113 49 orcus::json::node::push_back (C++ function), 113 orcus::json_handler::null (C++ function), 49 orcus::json::node_t (C++ enum), 115 orcus::json_handler::number (C++ function), 50 orcus::json::node_t::array (C++ enumerator), orcus::json_handler::object_key (C++ function), 115 49 orcus::json::node_t::boolean_false (C++ enu- orcus::json_handler::string (C++ function), 49 merator), 115 orcus::json_parser (C++ class), 48 orcus::json::node_t::boolean_true (C++ enu- orcus::json_parser::handler_type (C++ type), 48 merator), 115 orcus::json_parser::json_parser (C++ function), orcus::json::node_t::null (C++ enumerator), 115 48 orcus::json::node_t::number (C++ enumerator), orcus::json_parser::parse (C++ function), 48 115 orcus::length_unit_t (C++ enum), 42 orcus::json::node_t::object (C++ enumerator), orcus::length_unit_t::centimeter (C++ enumer- 115 ator), 42 orcus::json::node_t::string (C++ enumerator), orcus::length_unit_t::inch (C++ enumerator), 42 115 orcus::length_unit_t::millimeter (C++ enumer- orcus::json::node_t::unset (C++ enumerator), ator), 42 115 orcus::length_unit_t::pixel (C++ enumerator),

152 Index Orcus Documentation, Release 0.16

42 orcus::orcus_xlsx::read_stream (C++ function), orcus::length_unit_t::point (C++ enumerator), 92 42 orcus::orcus_xml (C++ class), 93 orcus::length_unit_t::twip (C++ enumerator), 42 orcus::orcus_xml::~orcus_xml (C++ function), 93 orcus::length_unit_t::unknown (C++ enumera- orcus::orcus_xml::append_field_link (C++ tor), 42 function), 94 orcus::length_unit_t::xlsx_column_digit orcus::orcus_xml::append_sheet (C++ function), (C++ enumerator), 42 94 orcus::orcus_csv (C++ class), 90 orcus::orcus_xml::commit_range (C++ function), orcus::orcus_csv::get_name (C++ function), 90 94 orcus::orcus_csv::orcus_csv (C++ function), 90 orcus::orcus_xml::detect_map_definition orcus::orcus_csv::read_file (C++ function), 90 (C++ function), 95 orcus::orcus_csv::read_stream (C++ function), 90 orcus::orcus_xml::operator= (C++ function), 93 orcus::orcus_gnumeric (C++ class), 93 orcus::orcus_xml::orcus_xml (C++ function), 93 orcus::orcus_gnumeric::~orcus_gnumeric (C++ orcus::orcus_xml::read_map_definition (C++ function), 93 function), 95 orcus::orcus_gnumeric::detect (C++ function), 93 orcus::orcus_xml::read_stream (C++ function), 94 orcus::orcus_gnumeric::get_name (C++ function), orcus::orcus_xml::set_cell_link (C++ function), 93 94 orcus::orcus_gnumeric::orcus_gnumeric (C++ orcus::orcus_xml::set_namespace_alias (C++ function), 93 function), 93 orcus::orcus_gnumeric::read_file (C++ func- orcus::orcus_xml::set_range_row_group (C++ tion), 93 function), 94 orcus::orcus_gnumeric::read_stream (C++ func- orcus::orcus_xml::start_range (C++ function), 94 tion), 93 orcus::orcus_xml::write (C++ function), 95 orcus::orcus_ods (C++ class), 91 orcus::orcus_xml::write_map_definition (C++ orcus::orcus_ods::~orcus_ods (C++ function), 91 function), 95 orcus::orcus_ods::detect (C++ function), 91 orcus::pstring (C++ class), 37 orcus::orcus_ods::get_name (C++ function), 91 orcus::pstring::clear (C++ function), 38 orcus::orcus_ods::orcus_ods (C++ function), 91 orcus::pstring::data (C++ function), 37 
orcus::orcus_ods::read_file (C++ function), 91 orcus::pstring::empty (C++ function), 38 orcus::orcus_ods::read_stream (C++ function), 91 orcus::pstring::get (C++ function), 37 orcus::orcus_xls_xml (C++ class), 91 orcus::pstring::hash (C++ struct), 38 orcus::orcus_xls_xml::~orcus_xls_xml (C++ orcus::pstring::hash::operator() (C++ func- function), 91 tion), 38 orcus::orcus_xls_xml::detect (C++ function), 92 orcus::pstring::operator!= (C++ function), 37, 38 orcus::orcus_xls_xml::get_name (C++ function), orcus::pstring::operator= (C++ function), 37 92 orcus::pstring::operator== (C++ function), 37, 38 orcus::orcus_xls_xml::operator= (C++ function), orcus::pstring::operator< (C++ function), 38 91 orcus::pstring::operator[] (C++ function), 37 orcus::orcus_xls_xml::orcus_xls_xml (C++ orcus::pstring::pstring (C++ function), 37 function), 91 orcus::pstring::resize (C++ function), 38 orcus::orcus_xls_xml::read_file (C++ function), orcus::pstring::size (C++ function), 37 91 orcus::pstring::str (C++ function), 37 orcus::orcus_xls_xml::read_stream (C++ func- orcus::pstring::trim (C++ function), 38 tion), 92 orcus::sax_handler (C++ class), 51 orcus::orcus_xlsx (C++ class), 92 orcus::sax_handler::attribute (C++ function), 52 orcus::orcus_xlsx::~orcus_xlsx (C++ function), orcus::sax_handler::characters (C++ function), 92 52 orcus::orcus_xlsx::detect (C++ function), 92 orcus::sax_handler::doctype (C++ function), 51 orcus::orcus_xlsx::get_name (C++ function), 92 orcus::sax_handler::end_declaration (C++ orcus::orcus_xlsx::operator= (C++ function), 92 function), 51 orcus::orcus_xlsx::orcus_xlsx (C++ function), 92 orcus::sax_handler::end_element (C++ function), orcus::orcus_xlsx::read_file (C++ function), 92 52

Index 153 Orcus Documentation, Release 0.16

orcus::sax_handler::start_declaration (C++ member), 81 function), 51 orcus::spreadsheet::address_t::row (C++ mem- orcus::sax_handler::start_element (C++ func- ber), 81 tion), 51 orcus::spreadsheet::border_direction_t (C++ orcus::sax_ns_handler (C++ class), 52 enum), 82 orcus::sax_ns_handler::attribute (C++ func- orcus::spreadsheet::border_direction_t::bottom tion), 52 (C++ enumerator), 82 orcus::sax_ns_handler::characters (C++ func- orcus::spreadsheet::border_direction_t::diagonal tion), 52 (C++ enumerator), 82 orcus::sax_ns_handler::doctype (C++ function), orcus::spreadsheet::border_direction_t::diagonal_bl_tr 52 (C++ enumerator), 82 orcus::sax_ns_handler::end_declaration (C++ orcus::spreadsheet::border_direction_t::diagonal_tl_br function), 52 (C++ enumerator), 82 orcus::sax_ns_handler::end_element (C++ func- orcus::spreadsheet::border_direction_t::left tion), 52 (C++ enumerator), 82 orcus::sax_ns_handler::start_declaration orcus::spreadsheet::border_direction_t::right (C++ function), 52 (C++ enumerator), 82 orcus::sax_ns_handler::start_element (C++ orcus::spreadsheet::border_direction_t::top function), 52 (C++ enumerator), 82 orcus::sax_ns_parser (C++ class), 50 orcus::spreadsheet::border_direction_t::unknown orcus::sax_ns_parser::~sax_ns_parser (C++ (C++ enumerator), 82 function), 50 orcus::spreadsheet::border_style_t (C++ orcus::sax_ns_parser::handler_type (C++ type), enum), 82 50 orcus::spreadsheet::border_style_t::dash_dot orcus::sax_ns_parser::parse (C++ function), 50 (C++ enumerator), 83 orcus::sax_ns_parser::sax_ns_parser (C++ orcus::spreadsheet::border_style_t::dash_dot_dot function), 50 (C++ enumerator), 83 orcus::sax_parser (C++ class), 50 orcus::spreadsheet::border_style_t::dashed orcus::sax_parser::~sax_parser (C++ function), (C++ enumerator), 83 50 orcus::spreadsheet::border_style_t::dotted orcus::sax_parser::config_type (C++ type), 50 (C++ enumerator), 83 orcus::sax_parser::handler_type (C++ type), 50 
orcus::spreadsheet::border_style_t::double_border orcus::sax_parser::parse (C++ function), 50 (C++ enumerator), 83 orcus::sax_parser::sax_parser (C++ function), 50 orcus::spreadsheet::border_style_t::double_thin orcus::sax_token_handler (C++ class), 52 (C++ enumerator), 83 orcus::sax_token_handler::characters (C++ orcus::spreadsheet::border_style_t::fine_dashed function), 53 (C++ enumerator), 83 orcus::sax_token_handler::declaration (C++ orcus::spreadsheet::border_style_t::hair function), 53 (C++ enumerator), 83 orcus::sax_token_handler::end_element (C++ orcus::spreadsheet::border_style_t::medium function), 53 (C++ enumerator), 83 orcus::sax_token_handler::start_element orcus::spreadsheet::border_style_t::medium_dash_dot (C++ function), 53 (C++ enumerator), 83 orcus::sax_token_parser (C++ class), 51 orcus::spreadsheet::border_style_t::medium_dash_dot_dot orcus::sax_token_parser::~sax_token_parser (C++ enumerator), 83 (C++ function), 51 orcus::spreadsheet::border_style_t::medium_dashed orcus::sax_token_parser::handler_type (C++ (C++ enumerator), 83 type), 51 orcus::spreadsheet::border_style_t::none orcus::sax_token_parser::parse (C++ function), (C++ enumerator), 82 51 orcus::spreadsheet::border_style_t::slant_dash_dot orcus::sax_token_parser::sax_token_parser (C++ enumerator), 83 (C++ function), 51 orcus::spreadsheet::border_style_t::solid orcus::spreadsheet::address_t (C++ struct), 81 (C++ enumerator), 82 orcus::spreadsheet::address_t::column (C++ orcus::spreadsheet::border_style_t::thick

(C++ enumerator), 83 orcus::spreadsheet::condition_operator_t::above_equal_average orcus::spreadsheet::border_style_t::thin (C++ enumerator), 88 (C++ enumerator), 83 orcus::spreadsheet::condition_operator_t::begins_with orcus::spreadsheet::border_style_t::unknown (C++ enumerator), 88 (C++ enumerator), 82 orcus::spreadsheet::condition_operator_t::below_average orcus::spreadsheet::col_t (C++ type), 80 (C++ enumerator), 88 orcus::spreadsheet::col_width_t (C++ type), 80 orcus::spreadsheet::condition_operator_t::below_equal_average orcus::spreadsheet::color_elem_t (C++ type), 80 (C++ enumerator), 88 orcus::spreadsheet::color_rgb_t (C++ struct), orcus::spreadsheet::condition_operator_t::between 81 (C++ enumerator), 87 orcus::spreadsheet::color_rgb_t::blue (C++ orcus::spreadsheet::condition_operator_t::bottom_n member), 82 (C++ enumerator), 88 orcus::spreadsheet::color_rgb_t::color_rgb_t orcus::spreadsheet::condition_operator_t::contains (C++ function), 81 (C++ enumerator), 88 orcus::spreadsheet::color_rgb_t::green (C++ orcus::spreadsheet::condition_operator_t::contains_blanks member), 82 (C++ enumerator), 88 orcus::spreadsheet::color_rgb_t::operator= orcus::spreadsheet::condition_operator_t::contains_error (C++ function), 82 (C++ enumerator), 88 orcus::spreadsheet::color_rgb_t::red (C++ orcus::spreadsheet::condition_operator_t::contains_no_error member), 82 (C++ enumerator), 88 orcus::spreadsheet::condition_date_t (C++ orcus::spreadsheet::condition_operator_t::duplicate enum), 88 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::last_7_daysorcus::spreadsheet::condition_operator_t::ends_with (C++ enumerator), 88 (C++ enumerator), 88 orcus::spreadsheet::condition_date_t::last_monthorcus::spreadsheet::condition_operator_t::equal (C++ enumerator), 89 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::last_weekorcus::spreadsheet::condition_operator_t::expression (C++ enumerator), 88 (C++ enumerator), 88 
orcus::spreadsheet::condition_date_t::last_yearorcus::spreadsheet::condition_operator_t::greater (C++ enumerator), 89 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::next_monthorcus::spreadsheet::condition_operator_t::greater_equal (C++ enumerator), 89 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::next_weekorcus::spreadsheet::condition_operator_t::less (C++ enumerator), 88 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::next_yearorcus::spreadsheet::condition_operator_t::less_equal (C++ enumerator), 89 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::this_monthorcus::spreadsheet::condition_operator_t::not_between (C++ enumerator), 88 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::this_weekorcus::spreadsheet::condition_operator_t::not_contains (C++ enumerator), 88 (C++ enumerator), 88 orcus::spreadsheet::condition_date_t::this_yearorcus::spreadsheet::condition_operator_t::not_equal (C++ enumerator), 89 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::today orcus::spreadsheet::condition_operator_t::top_n (C++ enumerator), 88 (C++ enumerator), 88 orcus::spreadsheet::condition_date_t::tomorroworcus::spreadsheet::condition_operator_t::unique (C++ enumerator), 88 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::unknown orcus::spreadsheet::condition_operator_t::unknown (C++ enumerator), 88 (C++ enumerator), 87 orcus::spreadsheet::condition_date_t::yesterdayorcus::spreadsheet::condition_type_t (C++ (C++ enumerator), 88 enum), 88 orcus::spreadsheet::condition_operator_t orcus::spreadsheet::condition_type_t::automatic (C++ enum), 87 (C++ enumerator), 88 orcus::spreadsheet::condition_operator_t::above_averageorcus::spreadsheet::condition_type_t::formula (C++ enumerator), 88 (C++ enumerator), 88

orcus::spreadsheet::condition_type_t::max tion), 96 (C++ enumerator), 88 orcus::spreadsheet::document::dump_check orcus::spreadsheet::condition_type_t::min (C++ function), 97 (C++ enumerator), 88 orcus::spreadsheet::document::dump_csv (C++ orcus::spreadsheet::condition_type_t::percent function), 97 (C++ enumerator), 88 orcus::spreadsheet::document::dump_flat orcus::spreadsheet::condition_type_t::percentile (C++ function), 96 (C++ enumerator), 88 orcus::spreadsheet::document::dump_html orcus::spreadsheet::condition_type_t::unknown (C++ function), 96 (C++ enumerator), 88 orcus::spreadsheet::document::dump_json orcus::spreadsheet::condition_type_t::value (C++ function), 97 (C++ enumerator), 88 orcus::spreadsheet::document::finalize (C++ orcus::spreadsheet::conditional_format_t function), 98 (C++ enum), 87 orcus::spreadsheet::document::get_config orcus::spreadsheet::conditional_format_t::colorscale (C++ function), 97 (C++ enumerator), 87 orcus::spreadsheet::document::get_formula_grammar orcus::spreadsheet::conditional_format_t::condition (C++ function), 97 (C++ enumerator), 87 orcus::spreadsheet::document::get_formula_name_resolver orcus::spreadsheet::conditional_format_t::databar (C++ function), 97 (C++ enumerator), 87 orcus::spreadsheet::document::get_model_context orcus::spreadsheet::conditional_format_t::date (C++ function), 97 (C++ enumerator), 87 orcus::spreadsheet::document::get_origin_date orcus::spreadsheet::conditional_format_t::formula (C++ function), 97 (C++ enumerator), 87 orcus::spreadsheet::document::get_pivot_collection orcus::spreadsheet::conditional_format_t::iconset (C++ function), 96 (C++ enumerator), 87 orcus::spreadsheet::document::get_shared_strings orcus::spreadsheet::conditional_format_t::unknown (C++ function), 96 (C++ enumerator), 87 orcus::spreadsheet::document::get_sheet orcus::spreadsheet::data_table_type_t (C++ (C++ function), 96 enum), 86 orcus::spreadsheet::document::get_sheet_count orcus::spreadsheet::data_table_type_t::both (C++ 
function), 97 (C++ enumerator), 87 orcus::spreadsheet::document::get_sheet_index orcus::spreadsheet::data_table_type_t::column (C++ function), 97 (C++ enumerator), 86 orcus::spreadsheet::document::get_sheet_name orcus::spreadsheet::data_table_type_t::row (C++ function), 97 (C++ enumerator), 86 orcus::spreadsheet::document::get_sheet_size orcus::spreadsheet::databar_axis_t (C++ (C++ function), 97 enum), 89 orcus::spreadsheet::document::get_string_pool orcus::spreadsheet::databar_axis_t::automatic (C++ function), 97 (C++ enumerator), 89 orcus::spreadsheet::document::get_styles orcus::spreadsheet::databar_axis_t::middle (C++ function), 96 (C++ enumerator), 89 orcus::spreadsheet::document::get_table orcus::spreadsheet::databar_axis_t::none (C++ function), 98 (C++ enumerator), 89 orcus::spreadsheet::document::insert_table orcus::spreadsheet::document (C++ class), 95 (C++ function), 97 orcus::spreadsheet::document::~document orcus::spreadsheet::document::operator= (C++ function), 96 (C++ function), 96 orcus::spreadsheet::document::append_sheet orcus::spreadsheet::document::recalc_formula_cells (C++ function), 96 (C++ function), 96 orcus::spreadsheet::document::clear (C++ orcus::spreadsheet::document::set_config function), 96 (C++ function), 97 orcus::spreadsheet::document::document (C++ orcus::spreadsheet::document::set_formula_grammar function), 96 (C++ function), 97 orcus::spreadsheet::document::dump (C++ func- orcus::spreadsheet::document::set_origin_date

(C++ function), 97 (C++ enumerator), 83 orcus::spreadsheet::document::set_sheet_size orcus::spreadsheet::fill_pattern_t::medium_gray (C++ function), 97 (C++ enumerator), 84 orcus::spreadsheet::error_value_t (C++ enum), orcus::spreadsheet::fill_pattern_t::none 82 (C++ enumerator), 83 orcus::spreadsheet::error_value_t::div0 orcus::spreadsheet::fill_pattern_t::solid (C++ enumerator), 82 (C++ enumerator), 83 orcus::spreadsheet::error_value_t::na (C++ orcus::spreadsheet::formula_grammar_t (C++ enumerator), 82 enum), 84 orcus::spreadsheet::error_value_t::name orcus::spreadsheet::formula_grammar_t::gnumeric (C++ enumerator), 82 (C++ enumerator), 85 orcus::spreadsheet::error_value_t::null orcus::spreadsheet::formula_grammar_t::ods (C++ enumerator), 82 (C++ enumerator), 85 orcus::spreadsheet::error_value_t::num (C++ orcus::spreadsheet::formula_grammar_t::unknown enumerator), 82 (C++ enumerator), 84 orcus::spreadsheet::error_value_t::ref (C++ orcus::spreadsheet::formula_grammar_t::xls_xml enumerator), 82 (C++ enumerator), 84 orcus::spreadsheet::error_value_t::unknown orcus::spreadsheet::formula_grammar_t::xlsx (C++ enumerator), 82 (C++ enumerator), 85 orcus::spreadsheet::error_value_t::value orcus::spreadsheet::formula_t (C++ enum), 85 (C++ enumerator), 82 orcus::spreadsheet::formula_t::array (C++ orcus::spreadsheet::fill_pattern_t (C++ enumerator), 85 enum), 83 orcus::spreadsheet::formula_t::data_table orcus::spreadsheet::fill_pattern_t::dark_down (C++ enumerator), 85 (C++ enumerator), 83 orcus::spreadsheet::formula_t::normal (C++ orcus::spreadsheet::fill_pattern_t::dark_gray enumerator), 85 (C++ enumerator), 83 orcus::spreadsheet::formula_t::shared (C++ orcus::spreadsheet::fill_pattern_t::dark_grid enumerator), 85 (C++ enumerator), 83 orcus::spreadsheet::formula_t::unknown (C++ orcus::spreadsheet::fill_pattern_t::dark_horizontal enumerator), 85 (C++ enumerator), 83 orcus::spreadsheet::get_default_column_width orcus::spreadsheet::fill_pattern_t::dark_trellis (C++ function), 
89 (C++ enumerator), 83 orcus::spreadsheet::get_default_row_height orcus::spreadsheet::fill_pattern_t::dark_up (C++ function), 89 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t (C++ orcus::spreadsheet::fill_pattern_t::dark_vertical enum), 86 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t::center orcus::spreadsheet::fill_pattern_t::gray_0625 (C++ enumerator), 86 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t::distributed orcus::spreadsheet::fill_pattern_t::gray_125 (C++ enumerator), 86 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t::filled orcus::spreadsheet::fill_pattern_t::light_down (C++ enumerator), 86 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t::justified orcus::spreadsheet::fill_pattern_t::light_gray (C++ enumerator), 86 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t::left orcus::spreadsheet::fill_pattern_t::light_grid (C++ enumerator), 86 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t::right orcus::spreadsheet::fill_pattern_t::light_horizontal (C++ enumerator), 86 (C++ enumerator), 83 orcus::spreadsheet::hor_alignment_t::unknown orcus::spreadsheet::fill_pattern_t::light_trellis (C++ enumerator), 86 (C++ enumerator), 83 orcus::spreadsheet::iface::export_factory orcus::spreadsheet::fill_pattern_t::light_up (C++ class), 80 (C++ enumerator), 83 orcus::spreadsheet::iface::export_factory::~export_factory orcus::spreadsheet::fill_pattern_t::light_vertical (C++ function), 80

Index 157 Orcus Documentation, Release 0.16 orcus::spreadsheet::iface::export_factory::get_sheetorcus::spreadsheet::iface::import_conditional_format::set_databar_axis (C++ function), 80 (C++ function), 61 orcus::spreadsheet::iface::export_sheet orcus::spreadsheet::iface::import_conditional_format::set_databar_color_negative (C++ class), 80 (C++ function), 61 orcus::spreadsheet::iface::export_sheet::~export_sheetorcus::spreadsheet::iface::import_conditional_format::set_databar_color_positive (C++ function), 80 (C++ function), 61 orcus::spreadsheet::iface::export_sheet::write_stringorcus::spreadsheet::iface::import_conditional_format::set_databar_gradient (C++ function), 80 (C++ function), 61 orcus::spreadsheet::iface::import_array_formulaorcus::spreadsheet::iface::import_conditional_format::set_date (C++ class), 59 (C++ function), 61 orcus::spreadsheet::iface::import_array_formula::~import_array_formulaorcus::spreadsheet::iface::import_conditional_format::set_formula (C++ function), 59 (C++ function), 61 orcus::spreadsheet::iface::import_array_formula::commitorcus::spreadsheet::iface::import_conditional_format::set_icon_name (C++ function), 59 (C++ function), 61 orcus::spreadsheet::iface::import_array_formula::set_formulaorcus::spreadsheet::iface::import_conditional_format::set_iconset_reverse (C++ function), 59 (C++ function), 61 orcus::spreadsheet::iface::import_array_formula::set_rangeorcus::spreadsheet::iface::import_conditional_format::set_max_databar_length (C++ function), 59 (C++ function), 61 orcus::spreadsheet::iface::import_array_formula::set_result_boolorcus::spreadsheet::iface::import_conditional_format::set_min_databar_length (C++ function), 59 (C++ function), 61 orcus::spreadsheet::iface::import_array_formula::set_result_emptyorcus::spreadsheet::iface::import_conditional_format::set_operator (C++ function), 59 (C++ function), 61 
orcus::spreadsheet::iface::import_array_formula::set_result_stringorcus::spreadsheet::iface::import_conditional_format::set_range (C++ function), 59 (C++ function), 61, 62 orcus::spreadsheet::iface::import_array_formula::set_result_valueorcus::spreadsheet::iface::import_conditional_format::set_show_value (C++ function), 59 (C++ function), 61 orcus::spreadsheet::iface::import_auto_filter orcus::spreadsheet::iface::import_conditional_format::set_type (C++ class), 60 (C++ function), 61 orcus::spreadsheet::iface::import_auto_filter::~import_auto_filterorcus::spreadsheet::iface::import_conditional_format::set_xf_id (C++ function), 60 (C++ function), 61 orcus::spreadsheet::iface::import_auto_filter::append_column_match_valueorcus::spreadsheet::iface::import_data_table (C++ function), 60 (C++ class), 62 orcus::spreadsheet::iface::import_auto_filter::commitorcus::spreadsheet::iface::import_data_table::~import_data_table (C++ function), 60 (C++ function), 62 orcus::spreadsheet::iface::import_auto_filter::commit_columnorcus::spreadsheet::iface::import_data_table::commit (C++ function), 60 (C++ function), 62 orcus::spreadsheet::iface::import_auto_filter::set_columnorcus::spreadsheet::iface::import_data_table::set_first_reference (C++ function), 60 (C++ function), 62 orcus::spreadsheet::iface::import_auto_filter::set_rangeorcus::spreadsheet::iface::import_data_table::set_range (C++ function), 60 (C++ function), 62 orcus::spreadsheet::iface::import_conditional_formatorcus::spreadsheet::iface::import_data_table::set_second_reference (C++ class), 60 (C++ function), 62 orcus::spreadsheet::iface::import_conditional_format::~import_conditional_formatorcus::spreadsheet::iface::import_data_table::set_type (C++ function), 61 (C++ function), 62 orcus::spreadsheet::iface::import_conditional_format::commit_conditionorcus::spreadsheet::iface::import_factory (C++ function), 61 (C++ class), 62 
orcus::spreadsheet::iface::import_conditional_format::commit_entryorcus::spreadsheet::iface::import_factory::~import_factory (C++ function), 61 (C++ function), 63 orcus::spreadsheet::iface::import_conditional_format::commit_formatorcus::spreadsheet::iface::import_factory::append_sheet (C++ function), 62 (C++ function), 63 orcus::spreadsheet::iface::import_conditional_format::set_colororcus::spreadsheet::iface::import_factory::create_pivot_cache_definition (C++ function), 61 (C++ function), 63 orcus::spreadsheet::iface::import_conditional_format::set_condition_typeorcus::spreadsheet::iface::import_factory::create_pivot_cache_records (C++ function), 61 (C++ function), 63

158 Index Orcus Documentation, Release 0.16 orcus::spreadsheet::iface::import_factory::finalizeorcus::spreadsheet::iface::import_named_expression::set_named_expression (C++ function), 64 (C++ function), 66 orcus::spreadsheet::iface::import_factory::get_global_settingsorcus::spreadsheet::iface::import_named_expression::set_named_range (C++ function), 63 (C++ function), 66 orcus::spreadsheet::iface::import_factory::get_named_expressionorcus::spreadsheet::iface::import_pivot_cache_definition (C++ function), 63 (C++ class), 66 orcus::spreadsheet::iface::import_factory::get_reference_resolverorcus::spreadsheet::iface::import_pivot_cache_definition::~import_pivot_cache_definition (C++ function), 63 (C++ function), 66 orcus::spreadsheet::iface::import_factory::get_shared_stringsorcus::spreadsheet::iface::import_pivot_cache_definition::commit (C++ function), 63 (C++ function), 68 orcus::spreadsheet::iface::import_factory::get_sheetorcus::spreadsheet::iface::import_pivot_cache_definition::commit_field (C++ function), 63 (C++ function), 67 orcus::spreadsheet::iface::import_factory::get_stylesorcus::spreadsheet::iface::import_pivot_cache_definition::commit_field_item (C++ function), 63 (C++ function), 68 orcus::spreadsheet::iface::import_formula orcus::spreadsheet::iface::import_pivot_cache_definition::create_field_group (C++ class), 64 (C++ function), 67 orcus::spreadsheet::iface::import_formula::~import_formulaorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_count (C++ function), 64 (C++ function), 67 orcus::spreadsheet::iface::import_formula::commitorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_item_date_time (C++ function), 65 (C++ function), 68 orcus::spreadsheet::iface::import_formula::set_formulaorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_item_error (C++ function), 64 (C++ function), 68 
orcus::spreadsheet::iface::import_formula::set_positionorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_item_numeric (C++ function), 64 (C++ function), 67 orcus::spreadsheet::iface::import_formula::set_result_boolorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_item_string (C++ function), 64 (C++ function), 67 orcus::spreadsheet::iface::import_formula::set_result_emptyorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_max_date (C++ function), 65 (C++ function), 67 orcus::spreadsheet::iface::import_formula::set_result_stringorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_max_value (C++ function), 64 (C++ function), 67 orcus::spreadsheet::iface::import_formula::set_result_valueorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_min_date (C++ function), 64 (C++ function), 67 orcus::spreadsheet::iface::import_formula::set_shared_formula_indexorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_min_value (C++ function), 64 (C++ function), 67 orcus::spreadsheet::iface::import_global_settingsorcus::spreadsheet::iface::import_pivot_cache_definition::set_field_name (C++ class), 65 (C++ function), 67 orcus::spreadsheet::iface::import_global_settings::~import_global_settingsorcus::spreadsheet::iface::import_pivot_cache_definition::set_worksheet_source (C++ function), 65 (C++ function), 66, 67 orcus::spreadsheet::iface::import_global_settings::get_default_formula_grammarorcus::spreadsheet::iface::import_pivot_cache_records (C++ function), 65 (C++ class), 68 orcus::spreadsheet::iface::import_global_settings::set_character_setorcus::spreadsheet::iface::import_pivot_cache_records::~import_pivot_cache_records (C++ function), 65 (C++ function), 68 orcus::spreadsheet::iface::import_global_settings::set_default_formula_grammarorcus::spreadsheet::iface::import_pivot_cache_records::append_record_value_character (C++ function), 65 (C++ function), 68 
orcus::spreadsheet::iface::import_global_settings::set_origin_dateorcus::spreadsheet::iface::import_pivot_cache_records::append_record_value_numeric (C++ function), 65 (C++ function), 68 orcus::spreadsheet::iface::import_named_expressionorcus::spreadsheet::iface::import_pivot_cache_records::append_record_value_shared_item (C++ class), 65 (C++ function), 68 orcus::spreadsheet::iface::import_named_expression::~import_named_expressionorcus::spreadsheet::iface::import_pivot_cache_records::commit (C++ function), 66 (C++ function), 68 orcus::spreadsheet::iface::import_named_expression::commitorcus::spreadsheet::iface::import_pivot_cache_records::commit_record (C++ function), 66 (C++ function), 68 orcus::spreadsheet::iface::import_named_expression::set_base_positionorcus::spreadsheet::iface::import_pivot_cache_records::set_record_count (C++ function), 66 (C++ function), 68

Index 159 Orcus Documentation, Release 0.16 orcus::spreadsheet::iface::import_reference_resolverorcus::spreadsheet::iface::import_sheet::get_sheet_view (C++ class), 69 (C++ function), 71 orcus::spreadsheet::iface::import_reference_resolver::~import_reference_resolverorcus::spreadsheet::iface::import_sheet::get_table (C++ function), 69 (C++ function), 71 orcus::spreadsheet::iface::import_reference_resolver::resolve_addressorcus::spreadsheet::iface::import_sheet::set_auto (C++ function), 69 (C++ function), 72 orcus::spreadsheet::iface::import_reference_resolver::resolve_rangeorcus::spreadsheet::iface::import_sheet::set_bool (C++ function), 69 (C++ function), 72 orcus::spreadsheet::iface::import_shared_stringsorcus::spreadsheet::iface::import_sheet::set_date_time (C++ class), 69 (C++ function), 72 orcus::spreadsheet::iface::import_shared_strings::~import_shared_stringsorcus::spreadsheet::iface::import_sheet::set_format (C++ function), 69 (C++ function), 73 orcus::spreadsheet::iface::import_shared_strings::addorcus::spreadsheet::iface::import_sheet::set_string (C++ function), 70 (C++ function), 72 orcus::spreadsheet::iface::import_shared_strings::appendorcus::spreadsheet::iface::import_sheet::set_value (C++ function), 69 (C++ function), 72 orcus::spreadsheet::iface::import_shared_strings::append_segmentorcus::spreadsheet::iface::import_sheet_properties (C++ function), 70 (C++ class), 74 orcus::spreadsheet::iface::import_shared_strings::commit_segmentsorcus::spreadsheet::iface::import_sheet_properties::~import_sheet_properties (C++ function), 71 (C++ function), 74 orcus::spreadsheet::iface::import_shared_strings::set_segment_boldorcus::spreadsheet::iface::import_sheet_properties::set_column_hidden (C++ function), 70 (C++ function), 74 orcus::spreadsheet::iface::import_shared_strings::set_segment_fontorcus::spreadsheet::iface::import_sheet_properties::set_column_width (C++ function), 70 (C++ function), 74 
orcus::spreadsheet::iface::import_shared_strings::set_segment_font_colororcus::spreadsheet::iface::import_sheet_properties::set_merge_cell_range (C++ function), 70 (C++ function), 74 orcus::spreadsheet::iface::import_shared_strings::set_segment_font_nameorcus::spreadsheet::iface::import_sheet_properties::set_row_height (C++ function), 70 (C++ function), 74 orcus::spreadsheet::iface::import_shared_strings::set_segment_font_sizeorcus::spreadsheet::iface::import_sheet_properties::set_row_hidden (C++ function), 70 (C++ function), 74 orcus::spreadsheet::iface::import_shared_strings::set_segment_italicorcus::spreadsheet::iface::import_sheet_view (C++ function), 70 (C++ class), 74 orcus::spreadsheet::iface::import_sheet orcus::spreadsheet::iface::import_sheet_view::~import_sheet_view (C++ class), 71 (C++ function), 74 orcus::spreadsheet::iface::import_sheet::~import_sheetorcus::spreadsheet::iface::import_sheet_view::set_frozen_pane (C++ function), 71 (C++ function), 74 orcus::spreadsheet::iface::import_sheet::fill_down_cellsorcus::spreadsheet::iface::import_sheet_view::set_selected_range (C++ function), 73 (C++ function), 75 orcus::spreadsheet::iface::import_sheet::get_array_formulaorcus::spreadsheet::iface::import_sheet_view::set_sheet_active (C++ function), 72 (C++ function), 74 orcus::spreadsheet::iface::import_sheet::get_auto_filterorcus::spreadsheet::iface::import_sheet_view::set_split_pane (C++ function), 71 (C++ function), 74 orcus::spreadsheet::iface::import_sheet::get_conditional_formatorcus::spreadsheet::iface::import_styles (C++ function), 71 (C++ class), 75 orcus::spreadsheet::iface::import_sheet::get_data_tableorcus::spreadsheet::iface::import_styles::~import_styles (C++ function), 71 (C++ function), 75 orcus::spreadsheet::iface::import_sheet::get_formulaorcus::spreadsheet::iface::import_styles::commit_border (C++ function), 72 (C++ function), 77 
orcus::spreadsheet::iface::import_sheet::get_named_expressionorcus::spreadsheet::iface::import_styles::commit_cell_protection (C++ function), 72 (C++ function), 77 orcus::spreadsheet::iface::import_sheet::get_sheet_propertiesorcus::spreadsheet::iface::import_styles::commit_cell_style (C++ function), 71 (C++ function), 78 orcus::spreadsheet::iface::import_sheet::get_sheet_sizeorcus::spreadsheet::iface::import_styles::commit_cell_style_xf (C++ function), 73 (C++ function), 78

160 Index Orcus Documentation, Release 0.16 orcus::spreadsheet::iface::import_styles::commit_cell_xforcus::spreadsheet::iface::import_styles::set_font_count (C++ function), 78 (C++ function), 75 orcus::spreadsheet::iface::import_styles::commit_dxforcus::spreadsheet::iface::import_styles::set_font_italic (C++ function), 78 (C++ function), 75 orcus::spreadsheet::iface::import_styles::commit_fillorcus::spreadsheet::iface::import_styles::set_font_name (C++ function), 77 (C++ function), 75 orcus::spreadsheet::iface::import_styles::commit_fontorcus::spreadsheet::iface::import_styles::set_font_size (C++ function), 76 (C++ function), 75 orcus::spreadsheet::iface::import_styles::commit_number_formatorcus::spreadsheet::iface::import_styles::set_font_underline (C++ function), 77 (C++ function), 75 orcus::spreadsheet::iface::import_styles::set_border_colororcus::spreadsheet::iface::import_styles::set_font_underline_color (C++ function), 77 (C++ function), 76 orcus::spreadsheet::iface::import_styles::set_border_countorcus::spreadsheet::iface::import_styles::set_font_underline_mode (C++ function), 77 (C++ function), 76 orcus::spreadsheet::iface::import_styles::set_border_styleorcus::spreadsheet::iface::import_styles::set_font_underline_type (C++ function), 77 (C++ function), 76 orcus::spreadsheet::iface::import_styles::set_border_widthorcus::spreadsheet::iface::import_styles::set_font_underline_width (C++ function), 77 (C++ function), 75 orcus::spreadsheet::iface::import_styles::set_cell_formula_hiddenorcus::spreadsheet::iface::import_styles::set_number_format_code (C++ function), 77 (C++ function), 77 orcus::spreadsheet::iface::import_styles::set_cell_hiddenorcus::spreadsheet::iface::import_styles::set_number_format_count (C++ function), 77 (C++ function), 77 orcus::spreadsheet::iface::import_styles::set_cell_lockedorcus::spreadsheet::iface::import_styles::set_number_format_identifier (C++ function), 77 (C++ function), 77 
orcus::spreadsheet::iface::import_styles::set_cell_print_contentorcus::spreadsheet::iface::import_styles::set_strikethrough_style (C++ function), 77 (C++ function), 76 orcus::spreadsheet::iface::import_styles::set_cell_style_builtinorcus::spreadsheet::iface::import_styles::set_strikethrough_text (C++ function), 78 (C++ function), 76 orcus::spreadsheet::iface::import_styles::set_cell_style_countorcus::spreadsheet::iface::import_styles::set_strikethrough_type (C++ function), 78 (C++ function), 76 orcus::spreadsheet::iface::import_styles::set_cell_style_nameorcus::spreadsheet::iface::import_styles::set_strikethrough_width (C++ function), 78 (C++ function), 76 orcus::spreadsheet::iface::import_styles::set_cell_style_parent_nameorcus::spreadsheet::iface::import_styles::set_xf_apply_alignment (C++ function), 78 (C++ function), 78 orcus::spreadsheet::iface::import_styles::set_cell_style_xforcus::spreadsheet::iface::import_styles::set_xf_border (C++ function), 78 (C++ function), 78 orcus::spreadsheet::iface::import_styles::set_cell_style_xf_countorcus::spreadsheet::iface::import_styles::set_xf_fill (C++ function), 77 (C++ function), 77 orcus::spreadsheet::iface::import_styles::set_cell_xf_countorcus::spreadsheet::iface::import_styles::set_xf_font (C++ function), 77 (C++ function), 77 orcus::spreadsheet::iface::import_styles::set_dxf_countorcus::spreadsheet::iface::import_styles::set_xf_horizontal_alignment (C++ function), 77 (C++ function), 78 orcus::spreadsheet::iface::import_styles::set_fill_bg_colororcus::spreadsheet::iface::import_styles::set_xf_number_format (C++ function), 76 (C++ function), 78 orcus::spreadsheet::iface::import_styles::set_fill_countorcus::spreadsheet::iface::import_styles::set_xf_protection (C++ function), 76 (C++ function), 78 orcus::spreadsheet::iface::import_styles::set_fill_fg_colororcus::spreadsheet::iface::import_styles::set_xf_style_xf (C++ function), 76 (C++ function), 78 
orcus::spreadsheet::iface::import_styles::set_fill_pattern_typeorcus::spreadsheet::iface::import_styles::set_xf_vertical_alignment (C++ function), 76 (C++ function), 78 orcus::spreadsheet::iface::import_styles::set_font_boldorcus::spreadsheet::iface::import_table (C++ function), 75 (C++ class), 78 orcus::spreadsheet::iface::import_styles::set_font_colororcus::spreadsheet::iface::import_table::~import_table (C++ function), 76 (C++ function), 79

orcus::spreadsheet::iface::import_table::commitorcus::spreadsheet::import_factory::get_reference_resolver (C++ function), 79 (C++ function), 106 orcus::spreadsheet::iface::import_table::commit_columnorcus::spreadsheet::import_factory::get_shared_strings (C++ function), 79 (C++ function), 106 orcus::spreadsheet::iface::import_table::get_auto_filterorcus::spreadsheet::import_factory::get_sheet (C++ function), 79 (C++ function), 107 orcus::spreadsheet::iface::import_table::set_column_countorcus::spreadsheet::import_factory::get_styles (C++ function), 79 (C++ function), 106 orcus::spreadsheet::iface::import_table::set_column_identifierorcus::spreadsheet::import_factory::import_factory (C++ function), 79 (C++ function), 106 orcus::spreadsheet::iface::import_table::set_column_nameorcus::spreadsheet::import_factory::set_character_set (C++ function), 79 (C++ function), 107 orcus::spreadsheet::iface::import_table::set_column_totals_row_functionorcus::spreadsheet::import_factory::set_default_column_size (C++ function), 79 (C++ function), 107 orcus::spreadsheet::iface::import_table::set_column_totals_row_labelorcus::spreadsheet::import_factory::set_default_row_size (C++ function), 79 (C++ function), 107 orcus::spreadsheet::iface::import_table::set_display_nameorcus::spreadsheet::import_factory::set_formula_error_policy (C++ function), 79 (C++ function), 108 orcus::spreadsheet::iface::import_table::set_identifierorcus::spreadsheet::import_factory::set_recalc_formula_cells (C++ function), 79 (C++ function), 107 orcus::spreadsheet::iface::import_table::set_nameorcus::spreadsheet::pivot_cache (C++ class), (C++ function), 79 104 orcus::spreadsheet::iface::import_table::set_rangeorcus::spreadsheet::pivot_cache::~pivot_cache (C++ function), 79 (C++ function), 105 orcus::spreadsheet::iface::import_table::set_style_nameorcus::spreadsheet::pivot_cache::fields_type (C++ function), 79 (C++ type), 105 
orcus::spreadsheet::iface::import_table::set_style_show_column_stripesorcus::spreadsheet::pivot_cache::get_all_records (C++ function), 79 (C++ function), 105 orcus::spreadsheet::iface::import_table::set_style_show_first_columnorcus::spreadsheet::pivot_cache::get_field (C++ function), 79 (C++ function), 105 orcus::spreadsheet::iface::import_table::set_style_show_last_columnorcus::spreadsheet::pivot_cache::get_field_count (C++ function), 79 (C++ function), 105 orcus::spreadsheet::iface::import_table::set_style_show_row_stripesorcus::spreadsheet::pivot_cache::get_id (C++ function), 79 (C++ function), 105 orcus::spreadsheet::iface::import_table::set_totals_row_countorcus::spreadsheet::pivot_cache::insert_fields (C++ function), 79 (C++ function), 105 orcus::spreadsheet::import_factory (C++ orcus::spreadsheet::pivot_cache::insert_records class), 106 (C++ function), 105 orcus::spreadsheet::import_factory::~import_factoryorcus::spreadsheet::pivot_cache::pivot_cache (C++ function), 106 (C++ function), 105 orcus::spreadsheet::import_factory::append_sheetorcus::spreadsheet::pivot_cache::records_type (C++ function), 107 (C++ type), 105 orcus::spreadsheet::import_factory::create_pivot_cache_definitionorcus::spreadsheet::pivot_cache_field_t (C++ function), 106 (C++ struct), 104 orcus::spreadsheet::import_factory::create_pivot_cache_recordsorcus::spreadsheet::pivot_cache_field_t::group_data (C++ function), 107 (C++ member), 104 orcus::spreadsheet::import_factory::finalize orcus::spreadsheet::pivot_cache_field_t::items (C++ function), 107 (C++ member), 104 orcus::spreadsheet::import_factory::get_character_setorcus::spreadsheet::pivot_cache_field_t::max_date (C++ function), 107 (C++ member), 104 orcus::spreadsheet::import_factory::get_global_settingsorcus::spreadsheet::pivot_cache_field_t::max_value (C++ function), 106 (C++ member), 104 orcus::spreadsheet::import_factory::get_named_expressionorcus::spreadsheet::pivot_cache_field_t::min_date (C++ function), 106 (C++ member), 104

orcus::spreadsheet::pivot_cache_field_t::min_value (C++ member), 104
orcus::spreadsheet::pivot_cache_field_t::name (C++ member), 104
orcus::spreadsheet::pivot_cache_field_t::pivot_cache_field_t (C++ function), 104
orcus::spreadsheet::pivot_cache_group_by_t (C++ enum), 89
orcus::spreadsheet::pivot_cache_group_by_t::days (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::hours (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::minutes (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::months (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::quarters (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::range (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::seconds (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::unknown (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_by_t::years (C++ enumerator), 89
orcus::spreadsheet::pivot_cache_group_data_t (C++ struct), 103
orcus::spreadsheet::pivot_cache_group_data_t::base_field (C++ member), 103
orcus::spreadsheet::pivot_cache_group_data_t::base_to_group_indices (C++ member), 103
orcus::spreadsheet::pivot_cache_group_data_t::items (C++ member), 103
orcus::spreadsheet::pivot_cache_group_data_t::pivot_cache_group_data_t (C++ function), 103
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping (C++ member), 103
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type (C++ struct), 103
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::auto_end (C++ member), 104
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::auto_start (C++ member), 104
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::end (C++ member), 104
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::end_date (C++ member), 104
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::group_by (C++ member), 104
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::interval (C++ member), 104
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::range_grouping_type (C++ function), 103
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::start (C++ member), 104
orcus::spreadsheet::pivot_cache_group_data_t::range_grouping_type::start_date (C++ member), 104
orcus::spreadsheet::pivot_cache_id_t (C++ type), 80
orcus::spreadsheet::pivot_cache_item_t (C++ struct), 101
orcus::spreadsheet::pivot_cache_item_t::boolean (C++ member), 103
orcus::spreadsheet::pivot_cache_item_t::character (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::date_time (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::day (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::error (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::hour (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::item_type (C++ enum), 101
orcus::spreadsheet::pivot_cache_item_t::item_type::blank (C++ enumerator), 101
orcus::spreadsheet::pivot_cache_item_t::item_type::boolean (C++ enumerator), 101
orcus::spreadsheet::pivot_cache_item_t::item_type::character (C++ enumerator), 101
orcus::spreadsheet::pivot_cache_item_t::item_type::date_time (C++ enumerator), 101
orcus::spreadsheet::pivot_cache_item_t::item_type::error (C++ enumerator), 101
orcus::spreadsheet::pivot_cache_item_t::item_type::numeric (C++ enumerator), 101
orcus::spreadsheet::pivot_cache_item_t::item_type::unknown (C++ enumerator), 101
orcus::spreadsheet::pivot_cache_item_t::minute (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::month (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::n (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::numeric (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::operator= (C++ function), 102
orcus::spreadsheet::pivot_cache_item_t::operator== (C++ function), 102
orcus::spreadsheet::pivot_cache_item_t::operator< (C++ function), 102
orcus::spreadsheet::pivot_cache_item_t::p (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::pivot_cache_item_t (C++ function), 102

orcus::spreadsheet::pivot_cache_item_t::second (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::swap (C++ function), 102
orcus::spreadsheet::pivot_cache_item_t::type (C++ member), 102
orcus::spreadsheet::pivot_cache_item_t::value (C++ member), 103
orcus::spreadsheet::pivot_cache_item_t::year (C++ member), 102
orcus::spreadsheet::pivot_cache_record_value_t (C++ struct), 100
orcus::spreadsheet::pivot_cache_record_value_t::boolean (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::character (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::date_time (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::day (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::hour (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::minute (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::month (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::n (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::numeric (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::operator!= (C++ function), 101
orcus::spreadsheet::pivot_cache_record_value_t::operator== (C++ function), 100
orcus::spreadsheet::pivot_cache_record_value_t::p (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::pivot_cache_record_value_t (C++ function), 100
orcus::spreadsheet::pivot_cache_record_value_t::second (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::shared_item_index (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::type (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::value (C++ member), 101
orcus::spreadsheet::pivot_cache_record_value_t::value_type (C++ enum), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::blank (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::boolean (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::character (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::date_time (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::error (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::numeric (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::shared_item_index (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::value_type::unknown (C++ enumerator), 100
orcus::spreadsheet::pivot_cache_record_value_t::year (C++ member), 101
orcus::spreadsheet::pivot_collection (C++ class), 105
orcus::spreadsheet::pivot_collection::~pivot_collection (C++ function), 105
orcus::spreadsheet::pivot_collection::get_cache (C++ function), 106
orcus::spreadsheet::pivot_collection::get_cache_count (C++ function), 106
orcus::spreadsheet::pivot_collection::insert_worksheet_cache (C++ function), 105
orcus::spreadsheet::pivot_collection::pivot_collection (C++ function), 105
orcus::spreadsheet::range_size_t (C++ struct), 81
orcus::spreadsheet::range_size_t::columns (C++ member), 81
orcus::spreadsheet::range_size_t::rows (C++ member), 81
orcus::spreadsheet::range_t (C++ struct), 81
orcus::spreadsheet::range_t::first (C++ member), 81
orcus::spreadsheet::range_t::last (C++ member), 81
orcus::spreadsheet::row_height_t (C++ type), 80
orcus::spreadsheet::row_t (C++ type), 80
orcus::spreadsheet::sheet (C++ class), 98
orcus::spreadsheet::sheet::~sheet (C++ function), 98
orcus::spreadsheet::sheet::dump_check (C++ function), 100
orcus::spreadsheet::sheet::dump_csv (C++ function), 100
orcus::spreadsheet::sheet::dump_flat (C++ function), 99
orcus::spreadsheet::sheet::dump_html (C++ function), 100
orcus::spreadsheet::sheet::dump_json (C++ function), 100
orcus::spreadsheet::sheet::fill_down_cells (C++ function), 99
orcus::spreadsheet::sheet::finalize (C++ function), 99


orcus::spreadsheet::sheet::get_auto_filter_data (C++ function), 99
orcus::spreadsheet::sheet::get_cell_format (C++ function), 100
orcus::spreadsheet::sheet::get_col_width (C++ function), 98
orcus::spreadsheet::sheet::get_data_range (C++ function), 99
orcus::spreadsheet::sheet::get_date_time (C++ function), 99
orcus::spreadsheet::sheet::get_index (C++ function), 99
orcus::spreadsheet::sheet::get_merge_cell_range (C++ function), 99
orcus::spreadsheet::sheet::get_row_height (C++ function), 99
orcus::spreadsheet::sheet::get_string_identifier (C++ function), 99
orcus::spreadsheet::sheet::is_col_hidden (C++ function), 99
orcus::spreadsheet::sheet::is_row_hidden (C++ function), 99
orcus::spreadsheet::sheet::set_auto (C++ function), 98
orcus::spreadsheet::sheet::set_auto_filter_data (C++ function), 99
orcus::spreadsheet::sheet::set_bool (C++ function), 98
orcus::spreadsheet::sheet::set_col_hidden (C++ function), 99
orcus::spreadsheet::sheet::set_col_width (C++ function), 98
orcus::spreadsheet::sheet::set_date_time (C++ function), 98
orcus::spreadsheet::sheet::set_format (C++ function), 98
orcus::spreadsheet::sheet::set_formula (C++ function), 98
orcus::spreadsheet::sheet::set_grouped_formula (C++ function), 98
orcus::spreadsheet::sheet::set_merge_cell_range (C++ function), 99
orcus::spreadsheet::sheet::set_row_height (C++ function), 99
orcus::spreadsheet::sheet::set_row_hidden (C++ function), 99
orcus::spreadsheet::sheet::set_string (C++ function), 98
orcus::spreadsheet::sheet::set_value (C++ function), 98
orcus::spreadsheet::sheet::sheet (C++ function), 98
orcus::spreadsheet::sheet_t (C++ type), 80
orcus::spreadsheet::strikethrough_style_t (C++ enum), 84
orcus::spreadsheet::strikethrough_style_t::dash (C++ enumerator), 84
orcus::spreadsheet::strikethrough_style_t::dot_dash (C++ enumerator), 84
orcus::spreadsheet::strikethrough_style_t::dot_dot_dash (C++ enumerator), 84
orcus::spreadsheet::strikethrough_style_t::dotted (C++ enumerator), 84
orcus::spreadsheet::strikethrough_style_t::long_dash (C++ enumerator), 84
orcus::spreadsheet::strikethrough_style_t::none (C++ enumerator), 84
orcus::spreadsheet::strikethrough_style_t::solid (C++ enumerator), 84
orcus::spreadsheet::strikethrough_style_t::wave (C++ enumerator), 84
orcus::spreadsheet::strikethrough_text_t (C++ enum), 84
orcus::spreadsheet::strikethrough_text_t::cross (C++ enumerator), 84
orcus::spreadsheet::strikethrough_text_t::slash (C++ enumerator), 84
orcus::spreadsheet::strikethrough_text_t::unknown (C++ enumerator), 84
orcus::spreadsheet::strikethrough_type_t (C++ enum), 84
orcus::spreadsheet::strikethrough_type_t::double_type (C++ enumerator), 84
orcus::spreadsheet::strikethrough_type_t::none (C++ enumerator), 84
orcus::spreadsheet::strikethrough_type_t::single (C++ enumerator), 84
orcus::spreadsheet::strikethrough_type_t::unknown (C++ enumerator), 84
orcus::spreadsheet::strikethrough_width_t (C++ enum), 84
orcus::spreadsheet::strikethrough_width_t::bold (C++ enumerator), 84
orcus::spreadsheet::strikethrough_width_t::medium (C++ enumerator), 84
orcus::spreadsheet::strikethrough_width_t::thick (C++ enumerator), 84
orcus::spreadsheet::strikethrough_width_t::thin (C++ enumerator), 84
orcus::spreadsheet::strikethrough_width_t::unknown (C++ enumerator), 84
orcus::spreadsheet::strikethrough_width_t::width_auto (C++ enumerator), 84
orcus::spreadsheet::to_color_rgb (C++ function), 90
orcus::spreadsheet::to_error_value_enum (C++ function), 90
orcus::spreadsheet::to_pivot_cache_group_by_enum


(C++ function), 89
orcus::spreadsheet::to_totals_row_function_enum (C++ function), 89
orcus::spreadsheet::totals_row_function_t (C++ enum), 87
orcus::spreadsheet::totals_row_function_t::average (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::count (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::count_numbers (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::custom (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::maximum (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::minimum (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::none (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::standard_deviation (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::sum (C++ enumerator), 87
orcus::spreadsheet::totals_row_function_t::variance (C++ enumerator), 87
orcus::spreadsheet::underline_attrs_t (C++ struct), 81
orcus::spreadsheet::underline_attrs_t::underline_mode (C++ member), 81
orcus::spreadsheet::underline_attrs_t::underline_style (C++ member), 81
orcus::spreadsheet::underline_attrs_t::underline_type (C++ member), 81
orcus::spreadsheet::underline_attrs_t::underline_width (C++ member), 81
orcus::spreadsheet::underline_mode_t (C++ enum), 86
orcus::spreadsheet::underline_mode_t::continuos (C++ enumerator), 86
orcus::spreadsheet::underline_mode_t::skip_white_space (C++ enumerator), 86
orcus::spreadsheet::underline_t (C++ enum), 85
orcus::spreadsheet::underline_t::dash (C++ enumerator), 85
orcus::spreadsheet::underline_t::dot_dash (C++ enumerator), 85
orcus::spreadsheet::underline_t::dot_dot_dot_dash (C++ enumerator), 85
orcus::spreadsheet::underline_t::dotted (C++ enumerator), 85
orcus::spreadsheet::underline_t::double_accounting (C++ enumerator), 85
orcus::spreadsheet::underline_t::double_line (C++ enumerator), 85
orcus::spreadsheet::underline_t::long_dash (C++ enumerator), 85
orcus::spreadsheet::underline_t::none (C++ enumerator), 85
orcus::spreadsheet::underline_t::single_accounting (C++ enumerator), 85
orcus::spreadsheet::underline_t::single_line (C++ enumerator), 85
orcus::spreadsheet::underline_t::wave (C++ enumerator), 85
orcus::spreadsheet::underline_type_t (C++ enum), 86
orcus::spreadsheet::underline_type_t::double_type (C++ enumerator), 86
orcus::spreadsheet::underline_type_t::none (C++ enumerator), 86
orcus::spreadsheet::underline_type_t::single (C++ enumerator), 86
orcus::spreadsheet::underline_width_t (C++ enum), 85
orcus::spreadsheet::underline_width_t::bold (C++ enumerator), 85
orcus::spreadsheet::underline_width_t::medium (C++ enumerator), 85
orcus::spreadsheet::underline_width_t::none (C++ enumerator), 85
orcus::spreadsheet::underline_width_t::normal (C++ enumerator), 85
orcus::spreadsheet::underline_width_t::percent (C++ enumerator), 86
orcus::spreadsheet::underline_width_t::positive_integer (C++ enumerator), 86
orcus::spreadsheet::underline_width_t::positive_length (C++ enumerator), 86
orcus::spreadsheet::underline_width_t::thick (C++ enumerator), 85
orcus::spreadsheet::underline_width_t::thin (C++ enumerator), 85
orcus::spreadsheet::ver_alignment_t (C++ enum), 86
orcus::spreadsheet::ver_alignment_t::bottom (C++ enumerator), 86
orcus::spreadsheet::ver_alignment_t::distributed (C++ enumerator), 86
orcus::spreadsheet::ver_alignment_t::justified (C++ enumerator), 86
orcus::spreadsheet::ver_alignment_t::middle (C++ enumerator), 86
orcus::spreadsheet::ver_alignment_t::top (C++ enumerator), 86
orcus::spreadsheet::ver_alignment_t::unknown (C++ enumerator), 86
orcus::string_pool (C++ class), 38
orcus::string_pool::~string_pool (C++ func-


tion), 38
orcus::string_pool::clear (C++ function), 39
orcus::string_pool::dump (C++ function), 39
orcus::string_pool::get_interned_strings (C++ function), 39
orcus::string_pool::intern (C++ function), 38, 39
orcus::string_pool::merge (C++ function), 39
orcus::string_pool::operator= (C++ function), 38
orcus::string_pool::size (C++ function), 39
orcus::string_pool::string_pool (C++ function), 38
orcus::string_pool::swap (C++ function), 39
orcus::tokens (C++ class), 39
orcus::tokens::get_token (C++ function), 39
orcus::tokens::get_token_name (C++ function), 39
orcus::tokens::is_valid_token (C++ function), 39
orcus::tokens::tokens (C++ function), 39
orcus::xml_name_t (C++ struct), 41
orcus::xml_name_t::name (C++ member), 42
orcus::xml_name_t::ns (C++ member), 42
orcus::xml_name_t::operator!= (C++ function), 41
orcus::xml_name_t::operator= (C++ function), 41
orcus::xml_name_t::operator== (C++ function), 41
orcus::xml_name_t::to_string (C++ function), 41
orcus::xml_name_t::to_string_type (C++ enum), 41
orcus::xml_name_t::to_string_type::use_alias (C++ enumerator), 41
orcus::xml_name_t::to_string_type::use_short_name (C++ enumerator), 41
orcus::xml_name_t::xml_name_t (C++ function), 41
orcus::xml_token_attr_t (C++ struct), 42
orcus::xml_token_attr_t::name (C++ member), 42
orcus::xml_token_attr_t::ns (C++ member), 42
orcus::xml_token_attr_t::raw_name (C++ member), 42
orcus::xml_token_attr_t::transient (C++ member), 42
orcus::xml_token_attr_t::value (C++ member), 42
orcus::xml_token_attr_t::xml_token_attr_t (C++ function), 42
orcus::xml_token_t (C++ type), 41
orcus::xml_writer (C++ class), 57
orcus::xml_writer::~xml_writer (C++ function), 57
orcus::xml_writer::add_attribute (C++ function), 57
orcus::xml_writer::add_content (C++ function), 57
orcus::xml_writer::add_namespace (C++ function), 57
orcus::xml_writer::operator= (C++ function), 57
orcus::xml_writer::pop_element (C++ function), 58
orcus::xml_writer::push_element (C++ function), 57
orcus::xml_writer::push_element_scope (C++ function), 57
orcus::xml_writer::scope (C++ class), 58
orcus::xml_writer::scope::~scope (C++ function), 58
orcus::xml_writer::scope::operator= (C++ function), 58
orcus::xml_writer::scope::scope (C++ function), 58
orcus::xml_writer::xml_writer (C++ function), 57
orcus::xmlns_context (C++ class), 54
orcus::xmlns_context::~xmlns_context (C++ function), 54
orcus::xmlns_context::dump (C++ function), 55
orcus::xmlns_context::get (C++ function), 54
orcus::xmlns_context::get_alias (C++ function), 55
orcus::xmlns_context::get_all_namespaces (C++ function), 55
orcus::xmlns_context::get_index (C++ function), 54
orcus::xmlns_context::get_short_name (C++ function), 54
orcus::xmlns_context::operator= (C++ function), 54
orcus::xmlns_context::pop (C++ function), 54
orcus::xmlns_context::push (C++ function), 54
orcus::xmlns_context::swap (C++ function), 55
orcus::xmlns_context::xmlns_context (C++ function), 54
orcus::xmlns_id_t (C++ type), 41
orcus::xmlns_repository (C++ class), 53
orcus::xmlns_repository::~xmlns_repository (C++ function), 53
orcus::xmlns_repository::add_predefined_values (C++ function), 53
orcus::xmlns_repository::create_context (C++ function), 53
orcus::xmlns_repository::get_identifier (C++ function), 53
orcus::xmlns_repository::get_short_name (C++ function), 54
orcus::xmlns_repository::xmlns_repository (C++ function), 53
orcus::yaml_handler (C++ class), 55
orcus::yaml_handler::begin_document (C++ function), 56
orcus::yaml_handler::begin_map (C++ function), 56
orcus::yaml_handler::begin_map_key (C++ func-


tion), 56
orcus::yaml_handler::begin_parse (C++ function), 56
orcus::yaml_handler::begin_sequence (C++ function), 56
orcus::yaml_handler::boolean_false (C++ function), 56
orcus::yaml_handler::boolean_true (C++ function), 56
orcus::yaml_handler::end_document (C++ function), 56
orcus::yaml_handler::end_map (C++ function), 56
orcus::yaml_handler::end_map_key (C++ function), 56
orcus::yaml_handler::end_parse (C++ function), 56
orcus::yaml_handler::end_sequence (C++ function), 56
orcus::yaml_handler::null (C++ function), 56
orcus::yaml_handler::number (C++ function), 56
orcus::yaml_handler::string (C++ function), 56
orcus::yaml_parser (C++ class), 55
orcus::yaml_parser::handler_type (C++ type), 55
orcus::yaml_parser::parse (C++ function), 55
orcus::yaml_parser::yaml_parser (C++ function), 55
orcus::zip_archive (C++ class), 40
orcus::zip_archive::~zip_archive (C++ function), 40
orcus::zip_archive::dump_file_entry (C++ function), 40
orcus::zip_archive::get_file_entry_count (C++ function), 40
orcus::zip_archive::get_file_entry_name (C++ function), 40
orcus::zip_archive::load (C++ function), 40
orcus::zip_archive::read_file_entry (C++ function), 40
orcus::zip_archive::zip_archive (C++ function), 40

P
PLUS (orcus.FormulaTokenOp attribute), 119

R
RANGE_REF (orcus.FormulaTokenOp attribute), 119
REFERENCE (orcus.FormulaTokenType attribute), 120

S
SEP (orcus.FormulaTokenOp attribute), 119
sheet_size (orcus.Sheet attribute), 121
SheetRows (built-in class), 121
sheets (orcus.Document attribute), 118
SINGLE_REF (orcus.FormulaTokenOp attribute), 119
STRING (orcus.CellType attribute), 118
STRING (orcus.FormulaTokenOp attribute), 119
STRING_WITH_ERROR (orcus.CellType attribute), 118

T
TABLE_REF (orcus.FormulaTokenOp attribute), 119
type (orcus.Cell attribute), 117
type (orcus.FormulaToken attribute), 119

U
UNKNOWN (orcus.CellType attribute), 118
UNKNOWN (orcus.FormatType attribute), 118
UNKNOWN (orcus.FormulaTokenOp attribute), 119
UNKNOWN (orcus.FormulaTokenType attribute), 120

V
value (orcus.Cell attribute), 117
VALUE (orcus.FormulaTokenOp attribute), 119
VALUE (orcus.FormulaTokenType attribute), 120

X
XLS_XML (orcus.FormatType attribute), 118
XLSX (orcus.FormatType attribute), 118
XML (orcus.FormatType attribute), 118

Y
YAML (orcus.FormatType attribute), 118
