Overview of Sparta to Jira Migration

ADTRAN MEMORANDUM CTB-2009-02

DATE: 06/05/09
SUBJECT: Overview of Sparta to Jira Migration [1]
FROM: C. Trevor Bowen
TO: Quality Directors and Managers, Dennis McMahan
COPY TO: Dan Joffe

ABSTRACT

Sparta is a long-standing issue tracking system, developed internally by Adtran and used since 1998. With over 50,000 issues, Sparta has served Adtran well in managing defects. However, Adtran has recently begun migrating from Sparta to Jira. Sparta is written in Microsoft's ASP, backed by Microsoft SQL Server 2000, and hosted on a Microsoft Windows 2000 platform. Jira is written in Java, backed by a MySQL 5.0 database, and hosted on a Linux RHEL 5.2 clone platform. Migration between the two systems requires some knowledge of both, especially the target system, Jira and MySQL. This memo gives an overview of the essential concepts of both systems and details the fundamentals of the custom tools developed to perform the migration. This information should prove helpful as Adtran's other issue tracking systems are migrated to Jira.

[1] /home/tbowen/Desktop/ctb_2009-02_sparta_jira_migration.odt

INTRODUCTION

Sparta was developed internally by Adtran and has been used since 1998 to track issues and defects. However, Adtran has recently begun to migrate from Sparta to Jira, a popular commercial issue management system. The EN division has already migrated all open field issues to Jira, as of May 2009. New issues are being entered into Jira instead of Sparta. Adtran's CN division is in the process of defining data categorization, workflow, and view screens. Once this is finalized, and some training is provided, the CN division will migrate to Jira too.

The Sparta tool uses Microsoft's SQL Server 2000 as its back-end database. Users interact with Sparta through a web browser, Microsoft's Internet Information Services (IIS), and a front-end developed in Microsoft's ASP language. The Sparta tool resides on the Adtran server, srv-sparta.

Jira is a front-end, written in Java, that runs in almost any Java application server, such as Tomcat. Jira can use almost any SQL relational database management system as its back-end, but Adtran uses MySQL. A traditional web server is not essential, because Tomcat can serve traditional content as well as Java Server Pages (JSP). However, Apache is used to proxy traffic, which eliminates Tomcat's port 8080 suffix from URLs. Adtran's primary Jira instance resides on the Adtran server, jira.adtran.com.

Migration was complicated, because Adtran had a live Jira system with production data, and because the architectures of the two systems were completely different. Ultimately, a program was developed in Perl to read the data directly from Sparta's MS-SQL server, transform it appropriately, and write it directly to the MySQL database of an offline Jira development server. After the development database was verified, it was uploaded to the offline production server, which was then restarted.

This memo explains the rationale, method, and complications of the path chosen to migrate the data. Adtran has many different issue tracking systems, which may one day all be migrated to Jira. If that occurs, the fundamentals of this process must be well understood and documented, as they will be used repeatedly.

MIGRATION METHODS

Adtran's migration to Jira is not a unique challenge. In fact, Jira's developer, Atlassian, details several different methods for migrating data to Jira:

http://confluence.atlassian.com/display/JIRA/Migrating+from+Other+Issue+Trackers

1. Built-in importers – Developed specifically for importing from Bugzilla, Mantis, and FogBugz, and no other systems.
2. CSV Importer – Jira offers a built-in wizard that imports issue data packed into a CSV format. However, this only migrates issue basics, not transition history, attachments, etc. Even comments require a special workaround.
3. Third-Party scripts for Trac – Sparta is not Trac, so this is not helpful.
4. Jelly Script – This was Atlassian's recommended method, since Adtran used a custom system, and since transition history was required.
5. RPC services – Offers SOAP and XML-RPC manipulation of Jira.
6. “Your own method” – Direct manipulation of the database is discouraged and not supported at all.

Options #1 and #3 were immediately dismissed, since Sparta is not one of the supported source databases. Given Adtran's requirement for full data migration (no data loss for any migrated issue), option #2 was quickly ruled out too. Option #6 was the first thought; however, it was also abandoned at first, because it was strongly discouraged by Atlassian and forum members. The next option, #4, was a “front door” approach, Jelly scripts. This method was used as the first real attempt to migrate Sparta data.

DEAD END: JELLY SCRIPT

Jelly is Apache's XML-based scripting “language”, where each XML tag is bound to a Java object or method. Jelly's XML attributes relate to method arguments, and tags can be nested, simulating argument inheritance or dependency. When Jira's Jelly runner is enabled, Jelly scripts can be fed into a running Jira service to automate data entry. Although it is well documented (http://www.atlassian.com/software/jira/docs/latest/jelly.html), Jira's Jelly implementation leaves much to be desired, which does not become apparent until after much experience:

1. Parsing errors are reported merely as a parsing error. No line numbers or offender information is provided. Debugging reduces to a brute-force, binary-search, trial-and-error approach, iteratively testing and reducing the input vector until the error is isolated. This is trivial for a handful of lines, but several of the Jelly scripts used to load Sparta data were over a million lines long!
2. Various symbols must be escaped by transforming them into HTML symbols. For example, an apostrophe (') cannot be merely escaped by a backslash; rather, it must be transformed into a character entity (for example, “&apos;”).
This is not difficult, but the necessary mappings are not documented and must be discovered through trial and error and prior knowledge of HTML markup and XML.
3. Jelly scripts are executed as they are interpreted. Therefore, any interruption leaves Jira in an indeterminate state, forcing a complete reload of the original database and re-execution of the entire corrected Jelly script.
4. Jelly execution is very slow. Final migration of the Sparta data using Jelly scripts required over 24 hours, even though all data was local. There were no network or VPN dependencies.
5. Jelly execution consumes a lot of memory. Over 6 GB of RAM had to be allocated to Tomcat's JVM to execute a single 260 MB Jelly script.
6. Jira's Jelly implementation is incomplete. Many functions required manipulation of the SQL database during Jelly execution to adjust timestamps, issue numbers, and user information, which the Jelly tags failed to request and store.
7. Jira's Jelly implementation is badly broken. The workflow transition tag not only failed to update the timestamps correctly, it also failed to leave any trace in the workflow history log, requiring excessive SQL post-processing to rectify, which rendered the entire process practically useless.

Although an interesting concept, the Jelly script approach proved to be a dead end, because of performance and capability issues.

Several Perl programs were developed to read the Sparta database directly from the server, srv-sparta, and write Jelly scripts to local disk, which could be used to import the Sparta data into Jira.
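The character escaping described in item 2 of the list above can be illustrated with a short sketch. The original generators were Perl programs; Python is used here purely for illustration. The function name `escape_for_jelly` is hypothetical, and the entity mappings shown are the standard XML ones, which is an assumption: the memo does not list the exact mappings that Jira's Jelly runner required. The sketch relies on the standard library's `xml.sax.saxutils.escape`, which always replaces `&`, `<`, and `>` and accepts a dictionary of additional replacements.

```python
from xml.sax.saxutils import escape

# Extra replacements beyond the &, <, and > that escape() always handles.
# Assumption: the standard XML entities for quotes are what the parser expects.
EXTRA_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_for_jelly(text: str) -> str:
    """Return text with XML-special characters replaced by entities.

    Hypothetical helper: a generator script would apply this to every
    free-text field (summary, description, comment) before writing it
    into an XML attribute or element body.
    """
    return escape(text, EXTRA_ENTITIES)


if __name__ == "__main__":
    print(escape_for_jelly("Unit won't boot & reports <ERR>"))
```

Applying a single escaping function to every free-text field at generation time avoids discovering unescaped characters one parsing error at a time.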
These Perl programs are available for reference here, listed in the order they were executed:

[email protected]:/home/tbowen/bin/sync_sparta_users.pl
[email protected]:/home/tbowen/bin/sync_sparta_issues.pl
[email protected]:/home/tbowen/bin/sync_sparta_history.pl
[email protected]:/home/tbowen/bin/sync_sparta_attachments.pl

Later, these scripts were combined into a single Perl program, although this effort was not finalized:

[email protected]:/home/tbowen/bin/sync_sparta_all.pl

Ultimately, these programs were abandoned, because more and more direct SQL manipulation had to be added to the Jelly scripts to compensate for incomplete or broken Jelly tags. Eventually, it became more practical to transport the entire database via SQL statements, transforming the data as necessary using an intermediate Perl program. Because of the time wasted and trust lost investing in Atlassian's Jelly implementation, which was recommended before the RPC approach, the RPC services approach was never seriously investigated.

DIRECT SQL MIGRATION VIA CUSTOM PERL PROGRAM

Although unsupported and discouraged by Atlassian, the Sparta database was migrated by directly querying the Sparta database on srv-sparta, transforming the data locally, and writing it directly to a local development server via SQL statements. The Perl programming language was initially designed for parsing and transforming large amounts of data, and Perl offers powerful modules for interfacing with various databases; therefore, Perl was chosen to develop the programmable step, #3, in the following procedure for migrating Sparta to Jira:

1. Prepare new Jira project via Jira's web GUI.
2. Convert Jira database to UTF-8 encoding.
3. Read, transform, and import Sparta data into development server.
4. Transport database from development to production server.

These fundamental steps are expanded below.

1) PREPARE JIRA PROJECT

A new Jira project was created to house all of the issues ported from Sparta into Jira.
The project was designed to accommodate read-only access, so users could search, read, and link against the full Sparta database, after it was migrated to Jira. Although similar states, resolutions, fields and other issue data types existed in the Jira system, several new data types were created, so issue data would match as closely as possible between Sparta and Jira.
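
Step 3 of the procedure listed earlier (read, transform, and import) can be sketched in miniature as follows. This is not the author's Perl program: it is a Python illustration under stated assumptions. Two in-memory SQLite databases stand in for Sparta's MS-SQL server and Jira's MySQL database, and the table names (`sparta_issues`, `jira_issues`), column names, status mapping, `SPARTA-` key prefix, and sample rows are all hypothetical; none of them reflects the real Sparta or Jira schema.

```python
import sqlite3

# Hypothetical status mapping; the memo does not give the real values.
STATUS_MAP = {"OPEN": "Open", "FIXED": "Resolved", "CLOSED": "Closed"}


def transform(row):
    """Convert one source row (id, title, status) into a target row."""
    issue_id, title, status = row
    # "SPARTA-" is an assumed project key, used only for illustration.
    return (f"SPARTA-{issue_id}", title.strip(), STATUS_MAP.get(status, "Open"))


def migrate(source, target):
    """Read all source issues, transform them, and write them in one transaction."""
    rows = source.execute("SELECT id, title, status FROM sparta_issues ORDER BY id")
    with target:  # commits on success, rolls back if an exception is raised
        target.executemany(
            "INSERT INTO jira_issues (issue_key, summary, status) VALUES (?, ?, ?)",
            (transform(r) for r in rows),
        )
    return target.execute("SELECT COUNT(*) FROM jira_issues").fetchone()[0]


def demo():
    """Build two stand-in databases with made-up rows and run the migration."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE sparta_issues (id INTEGER, title TEXT, status TEXT)")
    source.executemany(
        "INSERT INTO sparta_issues VALUES (?, ?, ?)",
        [(1, " Unit resets ", "OPEN"), (2, "Bad checksum", "FIXED")],
    )
    target.execute(
        "CREATE TABLE jira_issues (issue_key TEXT, summary TEXT, status TEXT)"
    )
    migrate(source, target)
    return target.execute(
        "SELECT issue_key, summary, status FROM jira_issues ORDER BY issue_key"
    ).fetchall()


if __name__ == "__main__":
    for issue in demo():
        print(issue)
```

Writing the batch inside a single transaction is a design choice of this sketch: a failure part-way through rolls the target back, rather than leaving it in the indeterminate state described for interrupted Jelly scripts.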
