
AP WEBFEEDS, MANAGER AND AGENT 3.X Frequently Asked Questions


WEBFEEDS, MANAGER AND AGENT

General
− What is WebFeeds?
− What is WebFeeds Manager?
− Should I use WebFeeds alone or with WFM?
− What do I need to capture feeds if I do not use WFM?
− What is the minimum recommended bandwidth?
− What are entitlements?
− Can I consolidate delivery channels as a result of this system?
− What user name and password can I use to connect to the WebFeeds server?
− Why am I only seeing three days’ worth of content?

Feeds and Metadata
− What markup language is used in WebFeeds?
− What is the AP ATOM format?
− What is NITF?
− What version of NITF is used in WebFeeds?
− What is the difference between the AP ATOM feed and the NITF document?
− Does the NITF document contain any AP metadata tags (for example, dateline and byline)?
− I have an old production system that doesn’t understand XML or ATOM. What do I do?
− What is the hNews format?
− What are AP Top Headlines?
− Do you provide possible values of the classification metadata elements?

‘Duplicates’ and Vital Tags
− What metadata tags are vital for parsing the AP ATOM feed?
− Why do I get ‘duplicate’ stories?
− How can I filter out duplicate stories from multiple products in separate feeds?

Content Files
− Which media types are supported in WebFeeds?
− What is a content file?
− How are content files referenced in an AP ATOM feed?
− Why are the links to content files from the feed not working anymore?
− In what formats are content files available?
− What are suggested media?
− Where can I find metadata for suggested photos and other media?

WEBFEEDS ONLY
− Does WebFeeds support Business-to-Consumer (B2C) syndication?
− I am writing my own client code to download a feed. What kind of authentication should I use?
− Are there any client implementation requirements?
− Where do I get my list of entitlements?

January 25, 2019 Page 1 of 21 AP WEBFEEDS, MANAGER AND AGENT 3.x FAQ

− Can I use package IDs to request a feed?
− What syntax and parameters do I use to request a feed?
− I am creating a feed and receiving a message that authentication failed. What does this mean?
− Why is the number of items that I get in the returned feed different from the maxItems value?
− How can I make sure that I am getting only new content with each feed request?
− I am interested in specific video renditions. How can I ensure that I receive them?
− Why am I not getting suggested photos in the ATOM syndication feed (format=6)?

WEBFEEDS MANAGER AND AGENT ONLY
− What are the system requirements for WebFeeds Manager?
− Are there any special requirements for Serial Port Integration?
− Is Power PC supported?
− How do I get WebFeeds Agent?
− What are the minimal settings that I must configure to start downloading content?
− Can I change my password?
− How do I filter out duplicate content using WebFeeds Agent?
− How can I post-process individual content files that are downloaded by the agent?
− Can I use multiple instances of WebFeeds Agent?
− What are the naming conventions for files downloaded by the agent?
− What are the SFF files that the agent downloads?
− Why can’t I open, move, copy, rename or delete some of my content files in Windows Explorer?
− The agent stops running on a remote server when I close the terminal window. What can I do?
− How can I run the agent using a proxy server?

TECHNICAL SUPPORT AND DOCUMENTATION
− How do I contact AP Customer Support?
− How can I comment on the documentation?

WEBFEEDS, MANAGER AND AGENT General

What is WebFeeds? AP WebFeeds is part of the new generation of AP’s content distribution platform, which provides your organization with greater control over when and what content is received from AP. WebFeeds delivers the last three days’ worth of news content to your organization via an HTTP feed for easy integration with your service or application. Back to questions

What is WebFeeds Manager? AP WebFeeds Manager (WFM) automatically downloads news content available through WebFeeds. You can download content for your entitlements, including AP products and packages to which your organization is entitled and/or saved searches created in AP Exchange. Content files and metadata are saved to specified folders watched by your content management system (CMS). You can also post-process content using custom scripts; for example, to alert your CMS that new content has arrived. In addition, serial port integration allows you to send ANPA and/or IPTC files to your front-end system or CMS using RS-232 serial transfer.


The WebFeeds Manager components are:
− WFM configuration portal. A Web-based application for managing configuration profiles, linking profiles to agents and monitoring ingestion progress. A configuration profile specifies the settings for downloading content, such as entitlements for which you wish to download content and other configuration options; for example, file saving settings and logging preferences.
− WebFeeds Agent (WFA). A Java application that downloads content using the settings from the linked configuration profile. It is also used for starting and stopping downloads and viewing status and log messages.
Back to questions

Should I use WebFeeds alone or with WFM? WFM is an out-of-the-box product that offers a wide variety of configuration options to meet your needs and allows you to start downloading content right away. Using WebFeeds directly is recommended if you are looking for tighter integration with your own CMS and/or content ingestion system; however, this requires development resources to implement a solution for capturing and processing a feed. Back to questions

What do I need to capture feeds if I do not use WFM? Capturing and processing a feed involves making the feed request, retrieving the feed items and saving them along with any suggested media to your file system, database or CMS/production system. You must write a program that makes Web requests to the WebFeeds server and processes the returned XML responses. Successful programs can be written in almost any language that is capable of making Web requests, such as Perl, PHP, Python, Java or any .Net language. The program must use basic Web requests and responses and must be capable of parsing XML. For more information, refer to the AP WebFeeds Developer’s Guide. Back to questions
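The processing step described above can be sketched in Python as follows. This is a minimal illustration, assuming only that the response is a standard ATOM 1.0 document; the function name `parse_entries` and the sample feed are hypothetical, and AP-specific metadata elements are ignored here.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"  # standard ATOM 1.0 namespace


def parse_entries(feed_xml):
    """Return one dict per feed entry with its ID, title and link URLs."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.iter(ATOM_NS + "entry"):
        entries.append({
            "id": entry.findtext(ATOM_NS + "id", default=""),
            "title": entry.findtext(ATOM_NS + "title", default=""),
            # Each link's href is a URL from which a content file can be fetched.
            "links": [link.get("href") for link in entry.findall(ATOM_NS + "link")],
        })
    return entries


# Hypothetical one-entry feed, used only to exercise the parser.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>urn:publicid:ap.org:example</id>
    <title>Sample story</title>
    <link rel="enclosure" href="http://example.com/story.xml"/>
  </entry>
</feed>"""

print(parse_entries(SAMPLE))
```

A real client would then download each linked file and store it in the file system, database or CMS, as described above.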

What is the minimum recommended bandwidth? The minimum recommended bandwidth is 2 Mbps for feeds with text and photos and 512 Kbps for feeds with text only. Back to questions

What are entitlements? Entitlements are AP products and packages to which your organization is entitled and saved searches created in AP Exchange. A product is a standard AP news service or report, which is defined by a name, a product ID number and a description; for example, “AP Online National News.” A package is a bundle of products. A saved search is a search set up in the AP Exchange website that can be converted into a WebFeed, very similar to a personalized news wire. Back to questions

Can I consolidate delivery channels as a result of this system? Yes, all of the AP products to which you are entitled are available via WebFeeds or WFM. Delivery of stories in ANPA and IPTC 7901 formats is also supported. Back to questions

What user name and password can I use to connect to the WebFeeds server? Your WebFeeds user name and password, which you can also use in WFM, are provided in the Welcome Letter. AP Exchange and AP Images users may use their respective user names and passwords to get authenticated by the WebFeeds system. If you plan to download content from AP Exchange saved searches, use the user name and password of the Site Administrator account to access all saved searches for your organization. AP Exchange users who do not have Site Administrator privileges can receive feeds only for the saved searches that they created or copied from their organization’s shared searches. Back to questions


Why am I only seeing three days’ worth of content? WebFeeds provides access to the content from the last three days only. This is a big increase over satellite, which offers no ability to go back in time to retrieve content. Back to questions

Feeds and Metadata

What markup language is used in WebFeeds? The content is supplied as XML—the feeds are available in the AP ATOM format, and stories are provided in the NITF format. Optionally, you can get stories in ANPA, IPTC 7901 or hNews formats. Back to questions

What is the AP ATOM format? AP ATOM is the ATOM 1.0 format with additional proprietary metadata inserted by AP and embedded links to news stories, photos, graphics, audio and video files. The AP ATOM feed may also include external links to the AP Online Video Network (OVN) and third-party websites. Back to questions

What is NITF? NITF (News Industry Text Format) is an XML-compliant markup language for news copy, press releases, wire services, newspapers, broadcasters and Web-based news organizations. Back to questions

What version of NITF is used in WebFeeds? NITF version 3.4. Back to questions

What is the difference between the AP ATOM feed and the NITF document? The AP ATOM feed is a list of content items (such as stories, photos, graphics, audio and video) that contains references to the actual content files (in the form of a URL string) and AP proprietary metadata, such as news management information and content metadata tags. The NITF document contains the actual text story. Back to questions

Does the NITF document contain any AP metadata tags (for example, dateline and byline)? NITF documents do not contain the full set of AP metadata because it is included in the AP ATOM feed. Back to questions

I have an old production system that doesn’t understand XML or ATOM. What do I do? WebFeeds and WFM allow you to download your stories in ANPA or IPTC 7901 format for use with some older publication systems. Back to questions

What is the hNews format? hNews is a microformat for news content that is based on XHTML elements. For more information, visit http://microformats.org. Back to questions

What are AP Top Headlines? AP Top Headlines are collections of AP’s top news stories that are filed by AP editors multiple times during the day, many times with the same stories (top stories don’t change all that often from one hour to the next). In the AP ATOM feed, entries for AP Top Headline stories are preceded by a Top Headline parent entry that identifies all of these stories. Each of the individual story entries contains metadata tags that identify the story as part of AP Top Headlines. Individual Top Headline stories do not appear in chronological order in the feed.


Example of AP Top Headlines feed structure (only two stories are shown for brevity):

Example of an AP Top Headlines feed: - urn:publicid:ap.org:fe791c47ae224a60983c17fcef40d2ee 1 ... - ... - - - ... - urn:publicid:ap.org:37ccf119a68644cdb01168a278db1d20 ... - urn:publicid:ap.org:683503204eee4b21a10efe39796ce817 2 ... + xml" length="238"> - ... - - ... - urn:publicid:ap.org:b677f7be2bb24683baf8cf477e062935 ... - urn:publicid:ap.org:4a4981999cb54c54bf101cebd78d3911 3 ... + - ...


- 3 - ... - urn:publicid:ap.org:2d42cd5717044b9e8572bfe56d460d63 ...

Legend:
37ccf119a68644cdb01168a278db1d20   Top Headline parent’s Management ID
fe791c47ae224a60983c17fcef40d2ee   Top Headline parent’s entry ID
b677f7be2bb24683baf8cf477e062935   The first story’s Management ID
683503204eee4b21a10efe39796ce817   The first story’s entry ID
2d42cd5717044b9e8572bfe56d460d63   The second story’s Management ID
4a4981999cb54c54bf101cebd78d3911   The second story’s entry ID

For more information, see “AP Top Headlines in the AP ATOM Feed” in the AP WebFeeds Developer’s Guide. Back to questions

Do you provide possible values of the classification metadata elements? Yes, these values are available in the AP Classification Metadata Reference Guide. Back to questions

‘Duplicates’ and Vital Tags

What metadata tags are vital for parsing the AP ATOM feed?
− ManagementId. A unique identifier for the chain of news stories that comprise a content item. It remains the same for the initial version and each subsequent revision because it points to the chain of revised articles, and not an individual revision. The ManagementId is unique across all AP products: if an article appears in multiple products, it has the same ManagementId in all of these products. Example: urn:publicid:ap.org:bcdd90bfdbfe4d65a12f01d3620c46f8

− ManagementSequenceNumber. A natural number from 0 to the number of the article revisions: 0 for the initial version, 1 for the first revision, 2 for the second revision and so forth. The higher the number, the more recent the article’s revision. This value can be used for tracking revisions in conjunction with ManagementId. Example: 0

− id (also known as entry ID or Revision ID). A unique identifier for each individual revision of a content item; for example, a text article. If an article is written and rewritten several times during a news cycle as new information is uncovered, separate feed entries are created for the initial version and each rewrite. Example: urn:publicid:ap.org:53e8a08557cb40a4b8a85089c14eea6f

This ID is essential for identifying ‘duplicate’ entries; for example, from Top Headline products or from multiple products in separate feeds. For more information, see “Why do I get ‘duplicate’ stories?” on page 8 and “How can I filter out duplicate stories from multiple products in separate feeds?” on page 8 of this FAQ.
− ContentId. A unique ID that is assigned to a news item component. This ID can be used to correlate downloaded news item components with their specific metadata in a feed entry.

MEDIA TYPE         NEWS ITEM COMPONENTS
Text               NITF and optionally ANPA, IPTC and hNews versions of the story
Photos, graphics   Caption and different versions of the image (main, preview and thumbnail)
Audio              Audio files in various formats
Video              Video files in a variety of formats and at different quality levels

− updated (entry’s date updated), published (entry’s date filed) and apcm:FirstCreated (date first created).
   − The entry’s updated tag contains the date and time when the news item’s revision became available for syndication.
   − The entry’s published tag contains the date and time when the current revision of the item was filed.
   − The apcm:FirstCreated tag contains the date and time when the content for the current revision of the publication was created.
   For example, a photo taken at a Sunday night game and filed on Monday morning would carry the apcm:FirstCreated value from Sunday and the updated and published values from Monday:
   id: urn:publicid:ap.org:90dd993b9ef4360b3aa7ac47253ca19e
   updated: 2015-10-19T09:51:57.230Z
   published: 2015-10-19T09:51:40Z
   apcm:FirstCreated: 2015-10-18T19:30:05Z

Note: For news items with multiple revisions, such as news stories, the value is the date and time when the current revision was filed. − SequenceNumber. A unique sequential number that identifies each feed entry that is counted in the number of items returned by the WebFeeds server. All entries in all products except for Top Headline products are assigned SequenceNumber IDs. For Top Headline products, parent entries are assigned SequenceNumber IDs and are counted in the number of items returned by the WebFeeds server, but child story entries are not counted and do not have a SequenceNumber ID. For more information, see “What are AP Top Headlines?” on page 4 of this FAQ. Example of a valid SequenceNumber ID: Example of a SequenceNumber for a Top Headline story entry:

Important: To receive only new content with each feed request, you must use the SequenceNumber value in conjunction with the minimum date and time value from the previously returned feed. To prevent missing content, specify the sortOrder=chronological parameter in the request. For more information, see “How can I make sure that I am getting only new content with each feed request?” on page 13 of this FAQ.
− media-id. The Revision ID of suggested media (photos, graphics, audio or video). It is available in the NITF-formatted stories that can be either referenced or included in the AP ATOM feed. If you receive other AP ATOM feeds for images, audio and video, you can use the Revision IDs of suggested media to retrieve their AP metadata from the corresponding entries of the other feeds. For more information, see “Where can I find metadata for suggested photos and other media?” on page 9 of this FAQ.
Back to questions
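As a rough illustration of reading the vital tags listed above, the Python sketch below collects them from one feed entry by local element name, so that it does not depend on namespace prefixes and tolerates elements it does not recognize. The namespace URI `urn:example:apcm` and the sample entry layout are placeholders invented for this example; only the tag names and the sample values come from this FAQ.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

WANTED = {"id", "ManagementId", "ManagementSequenceNumber",
          "updated", "published", "FirstCreated"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_utc(value):
    """Parse a UTC timestamp with or without milliseconds."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def vital_tags(entry):
    """Return the first occurrence of each wanted tag found in the entry."""
    found = {}
    for element in entry.iter():
        name = local_name(element.tag)
        if name in WANTED and name not in found:
            found[name] = (element.text or "").strip()
    return found


# Hypothetical entry layout; the apcm namespace URI is a placeholder.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom" xmlns:apcm="urn:example:apcm">
  <id>urn:publicid:ap.org:90dd993b9ef4360b3aa7ac47253ca19e</id>
  <updated>2015-10-19T09:51:57.230Z</updated>
  <published>2015-10-19T09:51:40Z</published>
  <apcm:FirstCreated>2015-10-18T19:30:05Z</apcm:FirstCreated>
</entry>"""

tags = vital_tags(ET.fromstring(SAMPLE_ENTRY))
created = parse_utc(tags["FirstCreated"])
filed = parse_utc(tags["published"])
print(tags["id"], created < filed)
```

Matching on local names is a convenience for this sketch; a production client would normally bind the namespaces declared in the actual feed.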


Why do I get ‘duplicate’ stories? A story may appear more than once in the same feed only when the feed contains one or more Top Headline products. AP distributes a variety of Top Headline products, and they are filed multiple times throughout the day, often with the same stories. For example, if you are ingesting Top US Stories, Top International Stories and Top Financial Stories, an entry about the $700 billion bailout would likely be included in all three of them multiple times throughout the day (top stories don’t change all that often from one hour to the next). If you are using WebFeeds Manager, a new feed is generated every time WebFeeds Agent polls the WebFeeds server (by default, every five minutes). However, even if you are ingesting just one Top Headline product, you may still get multiple instances of the same story in the downloaded feed file. This may happen anytime the agent polls the feed server, and the returned content contains more than one Top Headline collection. For more information, see “What are AP Top Headlines?” on page 4. WebFeeds Manager allows you to filter out duplicate content. For more information, see “How do I filter out duplicate content using WebFeeds Agent?” on page 17. Back to questions

How can I filter out duplicate stories from multiple products in separate feeds? If you are getting content for multiple products in the same feed, duplicate entries are automatically deleted. To filter out duplicates from multiple products in separate feeds, you must be familiar with the AP ATOM elements related to a news item’s revision history (ManagementId, ManagementSequenceNumber and entry id). For more information, see “What metadata tags are vital for parsing the AP ATOM feed?” on page 6 of this FAQ. The following illustration shows an example of a news item’s revision history:
− The “WebFeeds Delivery” part of the illustration shows multiple story versions that are delivered as individual feed entries linked by their Management IDs. These stories can be delivered multiple times via different products specified in separate feed requests.
− The “AP Content Delivered” part shows how you can correctly group content according to its revision history and filter out duplicates from multiple products in separate feeds.
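One possible way to implement the grouping described above is sketched below in Python. It assumes each feed entry has already been reduced to a dictionary with the keys `id` (entry ID), `management_id` and `sequence` (the ManagementSequenceNumber as an integer); these key names and the function `group_revisions` are illustrative choices, not part of WebFeeds.

```python
def group_revisions(entries):
    """Group entries into revision chains and drop repeated revisions.

    Entries with an already-seen entry ID are treated as duplicates
    delivered through another product and are skipped.
    """
    chains = {}
    seen_ids = set()
    for entry in entries:
        if entry["id"] in seen_ids:
            continue  # same revision delivered again via another feed
        seen_ids.add(entry["id"])
        chains.setdefault(entry["management_id"], []).append(entry)
    for revisions in chains.values():
        # Oldest revision first, latest revision last.
        revisions.sort(key=lambda item: item["sequence"])
    return chains


# Hypothetical entries collected from two separate feed requests.
collected = [
    {"id": "rev-a1", "management_id": "story-a", "sequence": 1},
    {"id": "rev-a0", "management_id": "story-a", "sequence": 0},
    {"id": "rev-a1", "management_id": "story-a", "sequence": 1},  # duplicate
    {"id": "rev-b0", "management_id": "story-b", "sequence": 0},
]

chains = group_revisions(collected)
print({key: [item["id"] for item in value] for key, value in chains.items()})
```

The last element of each chain is the most recent revision received so far; whether earlier revisions are kept or discarded is a decision for the receiving system.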

Back to questions


Content Files

Which media types are supported in WebFeeds? WebFeeds supports text, photos, graphics, audio and video. Back to questions

What is a content file? A content file is a unique news story, photo, graphic, audio or video file. Back to questions

How are content files referenced in an AP ATOM feed? Content files are referenced as links within the AP ATOM feed. Back to questions

Why are the links to content files from the feed not working anymore? For security reasons, links to content files are available only for a certain period of time after you get the feed and then expire. Back to questions

In what formats are content files available? Content files are available in the following formats:
− Stories: Always provided in NITF XML and optionally in ANPA, IPTC 7901 and hNews.
− Photos: NITF XML (caption); JPEG (main, preview and thumbnail). The main image is a high-resolution version; the preview is a low-resolution version displayed in Web-based applications; the thumbnail is a small version.
− Graphics: NITF XML (caption); JPEG (preview and thumbnail); PDF, Illustrator or Freehand (main).
− Audio: MP3, MPG, RA, WAV.
− Video: NITF XML (caption), TXT (script and/or shotlist), JPEG (the preview image and thumbnails of different sizes); Flash, Windows Media, QuickTime, H.264 MP4, MPEG-2, and 3GP (at various bit rates and resolutions).

Back to questions

What are suggested media? Suggested media are photos, graphics, audio and video that are associated with a text story by an AP editor. Links to suggested media appear in NITF-formatted text stories linked to the AP ATOM feed. Back to questions

Where can I find metadata for suggested photos and other media? For suggested media (images, audio and video) referenced in NITF stories, AP metadata is not available in the feed or in the NITF document. However, if you receive other AP ATOM feeds for images, audio and video, you can use the Revision IDs of suggested media to retrieve their AP metadata from the corresponding entries of the other feeds. In the following example, an NITF story is embedded in the AP ATOM feed and includes a reference to a suggested photo. The AP metadata for the photo is not included in the feed or NITF story, but you can use the Revision ID of a suggested photo to retrieve its AP metadata from another AP ATOM photo feed:


In the NITF-formatted story, the Revision ID is located in the value attribute of the element (shown in green in the example below): -

In the photo feed, the Revision ID is the entry ID (shown in green below): - urn:publicid:ap.org:f7db680f94a9495cb982ea978328071d SWEDEN FIGURE SKATING WORLD CHAMPIONSHIPS 2008-03-18T15:51:43.947Z … - FM,JW Francois Mori ASSOCIATED PRESS AP ----- GOT134 SWEDEN FIGURE SKATING WORLD CHAMPIONSHIPS 080318010719 Photo - -

WEBFEEDS ONLY

Does WebFeeds support Business-to-Consumer (B2C) syndication? Unlike RSS feeds, WebFeeds does not support B2C syndication. The WebFeeds system is designed for Business-to-Business (B2B) syndication via your content management system. The feeds must not be published directly to your website. Back to questions

I am writing my own client code to download a feed. What kind of authentication should I use? Use HTTP Basic Authentication that is built into your programming language’s library. HTTP Basic Authentication is available on most platforms; for example, Perl, PHP, C or Java. For more information, see “Authentication” in the AP WebFeeds Developer’s Guide. Back to questions
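For illustration, the Python sketch below attaches an HTTP Basic Authentication header to a feed request using only the standard library. The credentials are placeholders, the helper name `build_request` is hypothetical, and the request is constructed but not sent.

```python
import base64
import urllib.request


def build_request(url, username, password):
    """Create a request carrying an HTTP Basic Authentication header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


request = build_request(
    "http://syndication.ap.org/AP.Distro.Feed/GetFeed.aspx?idList=1001&idListType=products",
    "user",    # placeholder user name
    "secret",  # placeholder password
)
print(request.get_header("Authorization"))

# To send the request: urllib.request.urlopen(request).read()
```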


Are there any client implementation requirements? Yes. Make sure that your custom program meets the following important requirements:
− Allowing for new XML elements and attributes in the feed. Your custom program must allow new XML elements and attributes to be added to the feed XML by the AP. Using XML parsing instead of string parsing is strongly encouraged for the continuous integration of your custom code with WebFeeds.
− Following redirects in download links. Since content item download URLs may use redirects for improved performance, your custom program must be configured to follow standard HTTP redirects when downloading content items.
− Using XML metadata instead of parsing download URLs. The format of the download URLs may change over time or may be different based on your own preferences or configuration. Instead of string parsing download URLs, your custom program must use the XML metadata of the content items returned along with the download links; specifically, in the element in AP ATOM and in the element in NITF.
Back to questions

Where do I get my list of entitlements? Your list of entitlements is included in the Welcome letter. You can also retrieve this list programmatically from the WebFeeds server at any time by accessing http://syndication.ap.org/AP.Distro.Feed/GetAccountInfo.aspx and using HTTP Basic Authentication to pass your account credentials to the server, similar to feed requests. You can also use optional parameters to request a list of products included in each package in the XML or CSV format. For more information, see “Retrieving Your Entitlements” in the AP WebFeeds Developer’s Guide. Back to questions

Can I use package IDs to request a feed? Yes, package IDs are the same as product IDs: when you specify one or more product or package IDs as the value of the idList parameter in a WebFeeds request, you must specify ‘products’ as the value of the idListType parameter. Back to questions

What syntax and parameters do I use to request a feed? Your feed request must be in the following format: http://syndication.ap.org/AP.Distro.Feed/GetFeed.aspx?idList={idList}&idListType={idListType}&[{Optional Parameters}] Note: The parameters may be specified in any order.

Parameters:
− idList*. One or more product, package or saved search IDs. Multiple IDs must be of the same type and must be specified as a comma-delimited list, with no spaces between characters. Examples: 1001; 3,1184,8385; 517044
− idListType*. The type of the IDs that are specified as the idList parameter values. Package and product IDs are of the same type and may be used with idListType=products. Examples: products; savedsearches
− idList2. Used in requests for one feed combining different types of IDs, in conjunction with the idList parameter. If the idList parameter specifies one or more product/package IDs, the idList2 parameter must specify one or more saved search IDs, or vice versa.
− idListType2. The type of the IDs specified as the idList2 parameter values. Example (idList2 and idListType2 together): idList=1184&idListType=products&idList2=517057&idListType2=savedsearches
− maxItems. The maximum number of items to include in the feed. The default is 25. The maximum allowed value is 50 (if you request more than 50 items, only up to 50 items are returned). Example: 25
− maxDateTime. The date and time before which the requested content was released, in the format YYYY-MM-DDTHH:mm:SSZ where the value must be in Coordinated Universal Time (UTC). The default is the time of the request. Example: 2017-02-21T17:52:49Z
− minDateTime. The date and time after which the requested content was released, in the format YYYY-MM-DDTHH:mm:SSZ or YYYY-MM-DDTHH:mm:SS.msZ, where the value must be in Coordinated Universal Time (UTC). The default is three days (72 hours) prior to the time of the request. The content is available for the last three days only. Examples: 2017-02-19T15:30:00Z; 2017-03-02T21:58:08.187Z
− sequenceNumber. A unique sequential number that identifies each feed entry and must be used in conjunction with the minDateTime parameter to request content that is newer than the latest sequence number in the previously returned feed. Use the sortOrder=chronological parameter in the request to prevent any missing content. For more information, see “How can I make sure that I am getting only new content with each feed request?” on page 13 of this FAQ. Example: 3830303
− fullContent. Inserts full stories into the AP ATOM document (recommended for feeds with multiple text stories to avoid numerous server requests to download each full story). Full stories can be in the NITF, ANPA, IPTC or hNews format. Examples: nitf (same as true); anpa; iptc; hnews
− showInlineLinks. Displays inline links in NITF-formatted stories. Example: true
− showAnpaLinks. Adds links to ANPA files to each story entry. Example: true
− showIptcLinks. Adds links to IPTC 7901 files to each story entry. Example: true
− showHNewsLinks. Adds links to hNews files, which are formatted as XHTML. Example: true
− showAllFilings. Returns all filings of ANPA and/or IPTC-formatted stories. If this parameter is not used, only the latest filing of a story is returned. For more information, see “How can I make sure that I am getting only new content with each feed request?” on page 13 of this FAQ. Example: true
− sortOrder. Sorts the feed items in chronological order, from the oldest at the top of the feed to the newest at the bottom. If this parameter is not specified, the feed items are sorted in reverse chronological order. Example: chronological
− autoFlatten. For feeds containing AP Top Headlines, autoFlatten=false displays only the parent entry and hides entries for the individual stories included in the Top Headline collection. If this parameter is not specified, the parent entry and the individual story entries are included in the feed. For more information, see “What are AP Top Headlines?” on page 4. Example: false
− compression. Compresses the feed to the gzip format and returns a gzip byte stream instead of the text/XML feed. You must uncompress the returned feed for further processing; for example, using the gzip application. Compressed feeds are recommended in most cases, especially for large requests. Example: true
− showODRL. Adds machine-readable restrictions in the ODRL format to each online video entry. For more information, see “Online Video Restrictions” in the AP WebFeeds Developer’s Guide. Example: true
Back to questions
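The request syntax and the compression parameter described above might be handled as in the following Python sketch. The helper names `feed_url` and `feed_text` are hypothetical, and decoding the feed as UTF-8 is an assumption rather than something this FAQ states.

```python
import gzip
from urllib.parse import urlencode

GET_FEED = "http://syndication.ap.org/AP.Distro.Feed/GetFeed.aspx"


def feed_url(id_list, id_list_type="products", **optional):
    """Build a GetFeed URL; optional parameters are passed by name."""
    params = {
        # Comma-delimited list with no spaces, as the idList rules require.
        "idList": ",".join(str(item) for item in id_list),
        "idListType": id_list_type,
    }
    params.update(optional)
    # Leave commas and colons unescaped so IDs and timestamps stay readable.
    return GET_FEED + "?" + urlencode(params, safe=",:")


def feed_text(body, compressed=False):
    """Return the feed XML as text, unpacking a gzip byte stream if needed."""
    data = gzip.decompress(body) if compressed else body
    return data.decode("utf-8")  # assumption: the feed is UTF-8 encoded


url = feed_url([3, 1184, 8385], maxItems=50,
               sortOrder="chronological", compression="true")
print(url)
```

When compression=true is requested, the bytes returned by the server would be passed to `feed_text(body, compressed=True)` before XML parsing.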


I am creating a feed and receiving a message that authentication failed. What does this mean? Either the user ID is not valid, or the user ID is not entitled to view the requested products. For help, contact AP Customer Support. Back to questions

Why is the number of items that I get in the returned feed different from the maxItems value? Since content is available for the last three days only, the feed may include fewer than the specified maximum number of items. If the feed contains AP Top Headlines, it may include more than the specified maxItems value because only the Top Headline parent entry is counted, but not the individual Top Headline stories. Back to questions

How can I make sure that I am getting only new content with each feed request? 1. Check the values of the and elements in the Feed Sequencing section at the top of the AP ATOM feed resulting from your previous request; for example: - urn:publicid:ap.org:421 - - <apxh:div xmlns:apxh="http://www.w3.org/1999/xhtml"> <apxh:span>Business News</apxh:span> </apxh:div> - - 2017-03-03T16:48:58.437Z ...

2. Specify these values as the values of the minDateTime and sequenceNumber parameters respectively in the next request (make sure to specify milliseconds in the value of the minDateTime parameter). To prevent missing content, specify the sortOrder=chronological parameter in the request. You must use the same parameters as in the previous feed request; in particular, the same product and saved search IDs. For example, to request a feed based on these sample values and a product ID 30029, use the following URL:
http://syndication.ap.org/AP.Distro.Feed/GetFeed.aspx?idList=30029&idListType=products&minDateTime=2017-03-03T16:48:58.437Z&sequenceNumber=3914494&sortOrder=chronological
Important: If you use the sequenceNumber parameter, you must also use the minDateTime parameter, and both values must result from the same previous feed request. Changing the minDateTime value without using the corresponding sequenceNumber value is not allowed. Changing or manipulating either value may produce unexpected results, and some content may be lost or duplicated. Note that when no new content is available, it is normal for the sequenceNumber and minDateTime values returned by the server to remain the same until new content arrives. For more information, see “Getting Unique Content” in the AP WebFeeds Developer’s Guide. Back to questions
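The two steps above amount to carrying a pair of values from one response into the next request. The Python sketch below keeps that pair together so that one value cannot be changed without the other; the class `FeedCursor` is a hypothetical helper, and reading the two values out of the feed XML is left to the caller.

```python
from urllib.parse import urlencode

GET_FEED = "http://syndication.ap.org/AP.Distro.Feed/GetFeed.aspx"


class FeedCursor:
    """Remembers the paired values returned by the previous feed."""

    def __init__(self, id_list, id_list_type="products"):
        self.id_list = id_list
        self.id_list_type = id_list_type
        self.position = None  # (minDateTime, sequenceNumber) or None

    def update(self, min_date_time, sequence_number):
        """Store both values together, exactly as the server returned them."""
        self.position = (min_date_time, str(sequence_number))

    def next_url(self):
        """Build the next request with the same IDs as the previous one."""
        params = {"idList": self.id_list, "idListType": self.id_list_type}
        if self.position is not None:
            params["minDateTime"], params["sequenceNumber"] = self.position
        params["sortOrder"] = "chronological"
        return GET_FEED + "?" + urlencode(params, safe=",:")


cursor = FeedCursor("30029")
first_url = cursor.next_url()  # first request: no paired values yet
cursor.update("2017-03-03T16:48:58.437Z", 3914494)  # sample values from above
print(cursor.next_url())
```

With the sample values, the second URL matches the example request shown above.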

I am interested in specific video renditions. How can I ensure that I receive them? Each video delivered by AP WebFeeds is made available in various renditions (formats, quality and encodings). In order to deliver new content to you as fast as possible, videos are released into WebFeeds as soon as some of the renditions are available and not necessarily when all of the renditions have finished being produced. As more renditions become available, the video entry will appear in your feed again with the same entry ID and Management ID, but with more renditions available for download. Important: To avoid discarding video rendition updates as duplicates and therefore missing some of the renditions that you may be interested in (for example, MP4), your application must ignore video feed entries until they appear in the feed with the rendition of interest available for download. Back to questions
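A minimal sketch of the rule above, in Python. The entry structure (a dictionary with "id" and "renditions" keys) and the function name should_process are assumptions made for illustration; the real feed entry must be parsed from the AP ATOM feed. The point is that an entry ID is recorded as processed only after the rendition of interest has appeared.

```python
def should_process(entry, wanted_rendition, processed_ids):
    """Return True once an unprocessed video entry offers the wanted rendition."""
    if entry["id"] in processed_ids:
        # Already downloaded with the rendition of interest.
        return False
    return wanted_rendition in entry["renditions"]


# Hypothetical sequence: the same video entry reappears as renditions are added.
feed_polls = [
    {"id": "video-1", "renditions": ["flv"]},
    {"id": "video-1", "renditions": ["flv", "mp4"]},
    {"id": "video-1", "renditions": ["flv", "mp4", "mov"]},
]

processed = set()
downloaded = []
for entry in feed_polls:
    if should_process(entry, "mp4", processed):
        downloaded.append(entry)
        # Mark the entry as seen only now, so earlier appearances
        # without MP4 are ignored rather than treated as the final version.
        processed.add(entry["id"])

print(len(downloaded))
```

In this sketch the first appearance is skipped, the second is downloaded, and the third is treated as already handled.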

Why am I not getting suggested photos in the ATOM syndication feed (format=6)? If available, links to suggested media (for example, photos linked to a story by AP editors) appear only in the NITF stories delivered in the AP ATOM feeds (format=4). The ATOM syndication feed (format=6) delivers stories in the hNews format; therefore, links to suggested media are not available in the ATOM syndication feed. If you are interested in getting links to suggested media, consider switching to the AP ATOM feeds (format=4). Back to questions

WEBFEEDS MANAGER AND AGENT ONLY

What are the system requirements for WebFeeds Manager?
WFM Portal
The WFM portal works with the versions listed below. If you do not have these versions installed, please upgrade.
OS            DESCRIPTION
MS Windows    − Microsoft Internet Explorer 8.0+* or 9.0+
              − The latest version of Mozilla Firefox or Google Chrome
              Note: *The site has been optimized for Internet Explorer 9.0+, Firefox and Chrome.
Mac OS        The latest version of Mozilla Firefox, Apple Safari or Google Chrome
Linux         The latest version of Mozilla Firefox or Google Chrome

Important: It is also advised that you clear the cache to ensure that the site's functionality works correctly after updating your browser version. To clear the cache: − Internet Explorer: Ctrl+F5 − Firefox/Chrome: Ctrl+Shift+Del (PC and Linux) / Shift+Cmd+Del (Mac) − Safari: Alt (Option)+Cmd+E

WebFeeds Agent − Windows 7, Windows XP, Windows 2000, Mac OS X or CentOS Linux 6.4. Basic agent functionality has been tested on Windows 8, Ubuntu Linux 13.04 and openSUSE Linux 12.1. Although not extensively tested on other platforms, the agent will most likely work on any platform that supports the latest Java Runtime Environment (see below). Important:

− WebFeeds Agent 3.x does not support the Mac PowerPC platform. AP recommends running the agent on an Intel-based machine. − Serial Port Integration is not supported on Linux. For more information, see “Are there any special requirements for Serial Port Integration?” on page 15.

− Java Runtime Environment (JRE). Running the agent requires a compatible Java Runtime Environment (JRE) installation on the computer where you will run the agent. There are many compatible Java versions on many operating systems, and you should choose the one that is the most appropriate for your needs. It is always recommended that you use the newest available release version of the JRE you chose and keep it up to date with the latest security patches and bug fixes. If you are unsure of which JRE to choose for your installation, you can use Oracle’s Open JDK (https://jdk.java.net/), which is free for all commercial use under the GNU General Public License. Oracle also now offers a paid licensed version with support and more frequent updates (https://www.java.com/en/). Mac OS X users can install the latest Java version via Software Update. − Administrator rights on the machine running the agent. For example, your account must be able to install applications and create and delete files and folders.

− If you are running the agent on a Windows system that has User Account Control (UAC), such as Windows 7, Windows Vista or Windows Server 2008, you must run the agent as Administrator or disable UAC. The recommended method for running as Administrator is to launch a command prompt as Administrator and then launch the WebFeedsAgent.jar file from that command prompt; for example:
java -jar WebFeedsAgent.jar
Note: Windows file systems limit the length of a file path. For more information, see “Why can’t I open, move, copy, rename or delete some of my content files in Windows Explorer?” on page 21 of this FAQ.
− On Linux, if your distribution does not allow root user login by default, you must run the agent from the Terminal as the root user. For example:
sudo java -jar WebFeedsAgent.jar
For more information, see “What is the minimum recommended bandwidth?” on page 3 and “Are there any special requirements for Serial Port Integration?” on page 15. Back to questions

Are there any special requirements for Serial Port Integration? Yes, there are a few additional requirements: − Windows XP/2000 or Mac OS X 10.4 or later. − Cabling and/or hardware adapters necessary to facilitate transfer, such as RS-232 serial null modem cable and a USB to serial port adapter. If the machine hosting the agent has a serial port, you need only a single null modem cable (use a null modem cable with connectors appropriate for your environment). To configure cabling and port adapters: 1. Install any drivers that may be necessary for the adapters you are using. 2. Connect the necessary cables and adapters to the computer hosting the agent and to the front-end system or CMS that receives ANPA and/or IPTC. For more information, see WebFeeds Agent Serial Port Integration for ANPA Quick Start Guide. Back to questions

Is Power PC supported? No, WebFeeds Agent 3.x does not work on the Mac PowerPC platform, and support for WebFeeds Agent 2.3 has been discontinued. AP recommends running the agent on an Intel-based machine. For more information, see "What are the system requirements for WebFeeds Manager?" on page 14 of these FAQs.

How do I get WebFeeds Agent? 1. Sign in to the WFM portal at http://wfm.ap.org using your user name and password provided in the WebFeeds Welcome Letter. 2. Click View Agents at the top to display the View Agents page. 3. Click to display the Get a New Agent page. 4. Click and save the agent jar file. Important: Change the file extension to .jar if the file has a different extension.

Back to questions

What are the minimal settings that I must configure to start downloading content? 1. Sign in to the WFM portal using your WebFeeds user name and password. 2. Click Create new profile on the Manage Profiles page, and then click Create New. 3. On the Create New Profile page, type the name of the profile in the New Profile Name box. 4. Click the Select Content tab, and then select the check boxes for products, packages and/or saved searches for which you wish to download content.

5. Click Save Changes to create your profile. 6. Click View Agents at the top, click , and then click to save the agent jar file. Important: Change the file extension to .jar if the file has a different extension.

7. From your desktop environment, browse to the folder where you saved the agent jar file and double-click the file name to display the WebFeeds Agent window. 8. Enter the same user name and password that you used for signing in to the WFM portal in the User Name and Password boxes. 9. Enter the agent name of your choice in the Agent Name box. The agent name allows you to identify this agent at the portal and must be unique. 10. Click Register to register this agent with the portal. The agent appears in the list of agents on the portal. 11. On the portal, click View Agents at top to display the View Agents page. 12. Under Actions, click Link and then click the name of the profile that you want to link to your agent. 13. In the agent, click Reregister and then click Start Ingesting to start content downloads. Back to questions

Can I change my password? Yes, WebFeeds Manager allows you to change your password. Important:

− The password change applies to all AP portals where your user account is used; for example, AP Exchange, AP Images and AP WebFeeds. − Since all WebFeeds agents store a copy of your account credentials, you must also change your password in all of the agents; otherwise, content ingestion will be interrupted. To change your password: 1. In each of the agents that is currently downloading content using the account for which you want to change the password, click Stop Ingesting in the bottom-right corner to stop content ingestion. 2. On the WFM portal, click Welcome, at top, and then click Change Password:

3. In the Change Password dialog box, read the important information about changing your password. To proceed with the password change, type your current and new passwords in the respective boxes and then click Submit:

4. On the Connecting tab in each of the agents, enter your user name, your new password and agent name, and then click Reregister. Back to questions

How do I filter out duplicate content using WebFeeds Agent? Duplicate content is content that has been ingested more than once within a 24-hour period (the standard news cycle). You may receive duplicate content for a variety of reasons. AP Top Headlines are filed multiple times throughout the day, often with the same stories. AP editors may file the same story for print and online use. The same story or media may appear in multiple entitlements (products, packages or saved searches). Stories may share related media; for example, the same photo may be linked to two different stories about the same news event. WFM allows you to prevent the ingestion of duplicate content. It divides content into two categories: − NITF. Text stories (including AP Top Headlines) and media captions. − Links. Media files (photos, graphics, video and audio) and other optional story formats, such as ANPA, IPTC and hNews. A linked file is considered a duplicate if its content ID matches one ingested earlier. For NITF content, WFM allows you to specify criteria for defining a duplicate. To specify duplicate content settings: 1. On the WFM portal, do one of the following to access the agent details page: − Click View Agents at the top, and then click the Edit link associated with the agent. − Click the agent name under Agents in the left pane. 2. The agent details page is displayed:

3. Click the Duplicate Settings tab to display the duplicate settings:

4. Do one or more of the following:
− Under NITF (Stories, Captions, Plain):
  − To specify criteria for filtering out duplicate NITF content and plain text derived from NITF, select the Discard duplicates with matching check box and one or more of the following check boxes: Body, Headline, Keywords, Revision ID and Slugline. Duplicates can be discarded if one or more of the following match:

    OPTION: Body
    METADATA ELEMENT IN THE AP ATOM FEED (XPATH): //entry/content/nitf/body/body.content/block/
    DESCRIPTION: The body content portions of a story or caption, excluding media links. Note: The element may contain multiple elements.

    OPTION: Headline
    METADATA ELEMENT IN THE AP ATOM FEED (XPATH): //entry/apcm:ContentMetadata/apcm:HeadLine
    DESCRIPTION: A brief synopsis of a content item that may include Publishing System Versioning information or editorial instructions.

    OPTION: Keywords
    METADATA ELEMENT IN THE AP ATOM FEED (XPATH): //entry/apcm:ContentMetadata/apcm:Keywords
    DESCRIPTION: A multi-word field used to expedite content searching.

    OPTION: Revision ID
    METADATA ELEMENT IN THE AP ATOM FEED (XPATH): //entry/id
    DESCRIPTION: A unique ID that changes through each revision of a content item.

    OPTION: Slugline
    METADATA ELEMENT IN THE AP ATOM FEED (XPATH): //entry/apcm:ContentMetadata/apcm:SlugLine
    DESCRIPTION: The story slug.

    Tips:
    − Not every caption has a headline. If only Headline is selected, then captions without headlines will not be considered duplicates.
    − If Revision ID is selected, the other criteria (Body, Headline, Keywords or Slugline) are unavailable because changes to these fields result in a new revision.
  − To save one duplicate per entitlement, select the Save one duplicate per entitlement check box. For example, this may be helpful if you are saving content for each entitlement in a different folder and want one copy of each story to appear in each folder separately.
  − To save all files associated with duplicate content (for example, related media and the entry XML) but not the NITF and plain text, select the Save associated files from duplicates check box. For example, a duplicate story is sometimes released only because its related media is new, so you can save the newly added related media by selecting this check box and the Discard duplicates check box in the Links (Media, ANPA, IPTC, hNews) section.
− Under Links (Media, ANPA, IPTC, hNews):
  − To filter out duplicate media, ANPA, IPTC and hNews files, select the Discard duplicates check box.
  − To save one duplicate per entitlement, select the Save one duplicate per entitlement check box.
− To clear duplicate definitions saved by the agent, click Reset Duplicate History at the top of the agent details page. The duplicate history for both NITF and links will be cleared at the beginning of the next polling cycle.
Note: Making changes to either NITF or links duplicate settings automatically clears the corresponding duplicate history. Changing ingestion time for the agent does not clear the duplicate history.
5. Click Save Changes.
Back to questions
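For readers who filter duplicates in their own code rather than in the agent, the idea behind these settings can be sketched in Python. This is not the agent's implementation. The class name DuplicateFilter, the dictionary field names, and the decision to require all selected criteria to match are assumptions made for illustration; the 24-hour window and the treatment of items that lack a selected field (such as captions without headlines) follow the description in this answer.

```python
import hashlib
from datetime import datetime, timedelta

# The FAQ defines duplicates relative to a 24-hour period.
WINDOW = timedelta(hours=24)


def duplicate_key(item, criteria):
    """Hash the selected fields; return None if any selected field is missing."""
    parts = []
    for name in sorted(criteria):
        value = item.get(name)
        if not value:
            # Assumption: an item without a selected field is never a duplicate.
            return None
        parts.append(f"{name}={value}")
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


class DuplicateFilter:
    def __init__(self, criteria):
        self.criteria = list(criteria)
        self.seen = {}  # key -> time the matching item was first ingested

    def is_duplicate(self, item, now):
        key = duplicate_key(item, self.criteria)
        if key is None:
            return False
        first_seen = self.seen.get(key)
        if first_seen is not None and now - first_seen < WINDOW:
            return True
        self.seen[key] = now
        return False

    def reset(self):
        """Clear the stored history, similar in spirit to Reset Duplicate History."""
        self.seen.clear()


story = {"headline": "Sample headline", "slugline": "SAMPLE-SLUG"}
dup_filter = DuplicateFilter(["headline", "slugline"])
start = datetime(2019, 1, 25, 12, 0)
print(dup_filter.is_duplicate(story, start))
print(dup_filter.is_duplicate(story, start + timedelta(hours=1)))
```

The first call records the story and the second reports it as a duplicate because it falls inside the 24-hour window.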

How can I post-process individual content files that are downloaded by the agent? If you configure the agent to run your own shell script, it makes one or more calls to your script during a polling session: after a feed is saved and after an entry is processed. When the agent finishes processing an entry, it issues an entry script call. The argument passed to your script consists of IDs that allow you to uniquely identify the entry. The argument is not intended to reference an individual content file. If you are using a script to post-process downloaded content, it is strongly recommended to select a folder hierarchy in which each entry is represented by a folder, such as “One folder per entry” or “One folder per entitlement and per entry” from the Folder Structure list under Download options in the configuration profile. Once you reference a folder that represents an entry, you can post-process all of the files that comprise an entry. For more information, see “Scripting Options” in the AP WebFeeds Manager Online Help. Back to questions
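Once a folder structure with one folder per entry is selected, post-processing can work on the whole folder. The Python sketch below only shows that step, under stated assumptions: how the entry folder is located from the IDs passed to your script is not covered here, the function names are hypothetical, and the demonstration uses a temporary folder with placeholder files.

```python
import tempfile
from pathlib import Path


def files_in_entry_folder(entry_dir):
    """Return every file below an entry folder, sorted for stable processing."""
    return sorted(p for p in Path(entry_dir).rglob("*") if p.is_file())


def group_by_extension(paths):
    """Group file names by extension so each type can be handled separately."""
    grouped = {}
    for path in paths:
        grouped.setdefault(path.suffix.lower(), []).append(path.name)
    return grouped


# Demonstration with a temporary folder standing in for one entry folder.
with tempfile.TemporaryDirectory() as tmp:
    entry = Path(tmp) / "entry-folder"
    entry.mkdir()
    for name in ("story.xml", "photo.jpg", "preview.jpg"):
        (entry / name).write_text("placeholder")
    grouped = group_by_extension(files_in_entry_folder(entry))

print(grouped)
```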

Can I use multiple instances of WebFeeds Agent? WebFeeds Agent allows you to run only one instance of the agent per Java Virtual Machine (JVM). This means that you can use only one agent copy per machine unless you are running commercial virtualization software (such as VMware). Back to questions

What are the naming conventions for files downloaded by the agent?
− AP ATOM feed file: The feed file name format can be one of the following:
  − feed_{ID}_{Date}.xml. If you manage file paths automatically or select the {ID}-{Date}.{ext} option from the Feed file naming format list when configuring paths manually, the feed file name contains the entitlement ID and the date and time when the feed file was saved; for example: feed_30598_2016-05-15T13-36-51.956Z.xml.
  − feed_{Date}.xml. If you select the {Date}.{ext} option from the Feed file naming format list, the feed file name contains the date and time when the feed file was saved; for example: feed_2016-05-15T13-36-51.956Z.xml.
− Entry XML metadata file: {ManagementID}-{RevisionID}-entry.xml or {ManagementID}-entry.xml, depending on the option selected from the Naming Format list under Download options in the configuration profile.
− Content files: The naming convention depends on the option selected from the Naming Format list under Download options in the configuration profile:
  − {ManagementID}-{RevisionID}-{ContentID}.{ext}, {ManagementID}-{ContentID}.{ext} or {ContentID}.{ext} can be used to correlate downloaded news item components with their specific metadata in a feed entry. {ManagementID}-{RevisionID}-{ContentID}.{ext} is the default:
    − {ManagementID} is a globally unique identifier that remains the same through each revision of the news item (for example, a text article) and corresponds to the tag in the AP ATOM feed.
    − {RevisionID} is a globally unique identifier that changes through each revision of a news item and corresponds to the entry tag in the AP ATOM feed.
    − {ContentID} is a unique ID that is assigned to a news item component and corresponds to the tag in the AP ATOM feed.

      MEDIA TYPE          NEWS ITEM COMPONENTS
      Text                NITF, ANPA, IPTC and hNews versions of the story
      Photos, graphics    Caption and different versions of the image (main, preview and thumbnail)
      Audio               Audio files in various formats
      Video               Video files in a variety of formats and at different quality levels

    − {ext} is a file name ending that designates the file type and typically consists of three alphanumeric characters (for example, .xml, .jpg, .pdf and .mpg). If a file extension is not available, the default file extension .BIN is used.
    Note: If you are saving ANPA and/or IPTC files using one of the file name format options containing {ContentID}, a unique five-character identifier is added to these options; for example: f007936159d1447ca882c2f38ef3bbe9-110f9.iptc.
  − {ManagementID}-{RevisionID}-{ComponentIndex}.{ext} saves all revisions of a news item using the component index, a natural number from 1 to the total number of the news item components.
    Note: If you choose to save all versions of ANPA and/or IPTC-formatted stories, a unique five-character identifier is added to these options. To save all versions, select Save All Versions in the ANPA/IPTC Options section of content settings.
  − {ManagementID}-{ComponentIndex}.{ext} saves only the latest revision of a news item. The agent overwrites the previous revision of the news item on your local disk when it downloads a more recent revision because both revisions have the same {ManagementID} and therefore the same file name.
  − {OriginalFileName} saves files with original file names. Because original file names are not unique, downloaded files may be overwritten. Files that have no original file name (for example, XML, ANPA and IPTC files) are saved with the default name: {ManagementID}-{RevisionID}-{ComponentIndex}.{ext}.
  − {OriginalFileName}-{PartialContentID}.{ext} saves files with original file names plus the last five digits of the {ContentID}; for example, EXAMPLE.JPEG-e68c0.jpg. This option helps prevent overwriting files saved with original file names. Files that have no original file name (for example, XML, ANPA and IPTC files) are saved with the default name: {ManagementID}-{RevisionID}-{ComponentIndex}.{ext}.
  − {TransmissionReference}-{Role}-{Updated}.{ext}, where
    − {TransmissionReference} is a code representing the location of original transmission according to practices of the provider. This is the book number for news stories and the transmission reference number for photos and graphics.
    − {Role} identifies the actual content components of the news item; for example, the body of a news story or a photo, its preview and thumbnail.
    − {Updated} is the entry’s date updated, which represents the date and time when the news item’s revision became available for syndication.
    Note: If a transmission reference is not available for a feed entry, content files are saved with the file names in the {ContentID}.{ext} format.
− Log files. The name of the active log file is WebFeedsAgent.log. The format of rolled log file names is:
  − WebFeedsAgent.log.N for files rolled on size, where N is a natural number from 1 to the specified maximum number of log files; for example, WebFeedsAgent.log.1, WebFeedsAgent.log.2 and so on.
  − WebFeedsAgent.log.YYYY-MM-DD.N for files rolled on date and size, where YYYY denotes the year, MM denotes the month, and DD denotes the day; for example, WebFeedsAgent.log.2016-02-09.1, WebFeedsAgent.log.2016-02-09.2 and so on.
Back to questions
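As an illustration of the feed file naming formats above, the following Python sketch parses both feed_{ID}_{Date}.xml and feed_{Date}.xml names. The function name parse_feed_file_name is hypothetical, and the pattern assumes a numeric entitlement ID and the timestamp layout seen in the examples in this answer (hyphens in the time portion and three-digit milliseconds).

```python
import re
from datetime import datetime

# Pattern derived from the two example names in this FAQ; the numeric ID
# and the exact timestamp layout are assumptions based on those examples.
_FEED_NAME = re.compile(
    r"^feed_(?:(?P<id>\d+)_)?"
    r"(?P<date>\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.\d{3}Z)\.xml$"
)


def parse_feed_file_name(name):
    """Return (entitlement_id or None, datetime the feed file was saved)."""
    match = _FEED_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized feed file name: {name!r}")
    saved_at = datetime.strptime(match.group("date"), "%Y-%m-%dT%H-%M-%S.%fZ")
    return match.group("id"), saved_at


print(parse_feed_file_name("feed_30598_2016-05-15T13-36-51.956Z.xml"))
print(parse_feed_file_name("feed_2016-05-15T13-36-51.956Z.xml"))
```

The entitlement ID is returned as None for names in the feed_{Date}.xml format, which do not carry an ID.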

What are the SFF files that the agent downloads? When saving files with original file names, you may receive files with a .SFF extension. These files would normally have a .jpg extension (if not saved with original file names). Back to questions

Why can’t I open, move, copy, rename or delete some of my content files in Windows Explorer? Windows file systems limit the length of a file path to approximately 260 characters. Files or folders with paths exceeding this limit cannot be opened, copied, moved or renamed in Windows Explorer. In some cases, Windows Explorer may not be able to delete these files and folders. Therefore, if you are downloading Top Headlines products on Windows and are not post-processing the downloaded files using custom scripts, it is strongly recommended to keep the path as short as possible by changing the default location for saving files to C:\WFA\content. To delete files or folders that you cannot open, contact AP Customer Support at [email protected]. Back to questions
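To see in advance whether a download location leaves enough room for file names, the path length can be estimated before files are saved. The Python sketch below is an illustration only: the function names are hypothetical, the limit of 260 is the approximate figure quoted in this answer, and the check is deliberately conservative by requiring the path to be shorter than that figure.

```python
import ntpath

# Approximate Windows path length limit quoted in this FAQ.
WINDOWS_PATH_LIMIT = 260


def path_length(base_dir, *parts):
    """Length of the Windows-style path formed from a base folder and parts."""
    return len(ntpath.join(base_dir, *parts))


def fits_windows_limit(base_dir, *parts, limit=WINDOWS_PATH_LIMIT):
    """Return True if the joined path stays below the limit."""
    return path_length(base_dir, *parts) < limit


# The short location recommended in this FAQ leaves most of the limit available.
print(path_length(r"C:\WFA\content", "a.xml"))
print(fits_windows_limit(r"C:\WFA\content", "x" * 250 + ".xml"))
```

The sketch uses ntpath so that the Windows path rules apply even when the check is run on another operating system.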

The agent stops running on a remote server when I close the terminal window. What can I do? Assuming you have registered the agent with the portal, start your agent remotely by typing:
exec java -jar WebFeedsAgent.jar commandLine > /dev/null &
Back to questions

How can I run the agent using a proxy server? Since the agent bypasses proxy server settings configured on your machine, use the following command line syntax to run the agent using a proxy server:
java -Dhttp.proxyHost=hostname -Dhttp.proxyPort=port -Dhttp.proxyUser=username -Dhttp.proxyPassword=password -jar WebFeedsAgent.jar Agent_parameters
Back to questions

TECHNICAL SUPPORT AND DOCUMENTATION

How do I contact AP Customer Support? − Phone: 877-836-9477 (U.S. toll-free number) or 212-621-7361 (from outside of the U.S.) − E-mail: [email protected] − Website: http://customersupport.ap.org Back to questions

How can I comment on the documentation? To comment on this FAQ, send an e-mail message to [email protected]. Back to questions
