X!Tandem Search Engine Interface

Description

This note describes the Proteios interface to a web-based GPM (Global Proteome Machine) X!Tandem search engine, i.e. how to run X!Tandem search jobs for Proteios spectrum file items and have the result files uploaded automatically. This functionality is intended to be used together with the import of X!Tandem result files, but is independent of the latter.

Introduction

X!Tandem is a search engine for protein/peptide identification, using mass spectrum files as input. Several formats are allowed for the spectrum files, including PKL, MGF, and mzData. The search procedure is controlled by a number of parameters, normally given in an XML file. Results from a search are also given in an XML file. X!Tandem can be installed as a local application, but GPM also offers a web-based installation.

Features of the Proteios X!Tandem Search Engine Interface

  • An interface to an X!Tandem search engine, specifically a web-based GPM (Global Proteome Machine) X!Tandem search engine, for starting X!Tandem search jobs for Proteios spectrum file items and having the result files uploaded automatically. A GPM installation is preferred for two reasons: the search itself can be performed on a separate system, which is an advantage from a performance perspective, and the use of a web interface avoids system-specific installation issues.
  • The Proteios X!Tandem search functionality is intended to be used together with the import of X!Tandem result files, but is independent of the latter. If desired, spectrum files can be exported from Proteios, X!Tandem searches performed externally, and the output XML files uploaded to Proteios, where the search results can be imported.
  • The Proteios interface allows the convenience of starting search jobs for several spectrum files in one operation, as long as the same search parameters are used.
  • The result files will be automatically uploaded to Proteios, where the source spectrum file is shown in the description field for each file item. The file type will be set to "Tandem result", for easy identification.
  • An X!Tandem search parameter set can be created with default values, or imported from an X!Tandem parameter file item (either from another X!Tandem parameter set, or from an uploaded X!Tandem XML input file).
  • X!Tandem search parameters can be edited from a GUI in Proteios, before a search is started.
  • An X!Tandem search parameter set can be exported to an X!Tandem XML input file, for reference purposes or use with external X!Tandem search engines.

Requirements

  1. Proteios installation (see Proteios 2 Installation).
  2. An accessible web-based GPM X!Tandem search engine, i.e. a URL to the web site.

Quick Install

  1. Install Proteios from binary archive or source code (see Proteios 2 Installation).
  2. Locate the Proteios X!Tandem search properties file template xtandem.properties.in, and copy it to a file named xtandem.properties in the same directory.
  3. Edit the xtandem.properties file and set the following values: xtandem.gpm.server.url to the URL of the GPM web site to use for the search; xtandem.gpm.result.filename.prefix to the prefix used for result files; xtandem.gpm.server.timediff.hours to the time difference in hours, if the GPM server is in another time zone; and xtandem.gpm.server.timediff.correction.minutes to the difference in system clock settings between the two systems. The other settings can normally be left at their default values. For a local GPM installation, the commented-out settings in the xtandem.properties file can be used. If you are not sure about the GPM result filename prefix, "GPM" should work in most cases. Otherwise, run a search through the GPM web interface and check how the result file is named.
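
As an illustration of step 3, the following Python sketch parses a properties text and reports which of the four settings named above are missing. The property names are taken from this page; the URL and the numeric values are placeholders, not defaults shipped with Proteios, and the helper names parse_properties and missing_keys are hypothetical.

```python
# Placeholder content for xtandem.properties. Only the four key names come
# from this page; the URL and the numbers are illustrative assumptions.
EXAMPLE_PROPERTIES = """\
# Hypothetical example values
xtandem.gpm.server.url = http://gpm.example.org
xtandem.gpm.result.filename.prefix = GPM
xtandem.gpm.server.timediff.hours = 0
xtandem.gpm.server.timediff.correction.minutes = 0
"""

REQUIRED_KEYS = (
    "xtandem.gpm.server.url",
    "xtandem.gpm.result.filename.prefix",
    "xtandem.gpm.server.timediff.hours",
    "xtandem.gpm.server.timediff.correction.minutes",
)


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


if __name__ == "__main__":
    props = parse_properties(EXAMPLE_PROPERTIES)
    print("Missing settings:", missing_keys(props))
```

The parser only handles the simple one-line "key = value" form; it is meant for a quick sanity check, not as a full Java properties reader.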

How to Run X!Tandem Search Jobs from Proteios

Creating/Editing an X!Tandem Parameter Set

  1. Select View -> Search Setup -> XTandem.
  2. Click on "New X!Tandem Parameter Set".
  3. Enter name and description, then click "Save" to copy the default X!Tandem parameter set, or click on "Select File to copy X!Tandem Parameters from".
  4. A new X!Tandem parameter set is now created, together with a related X!Tandem parameter file.
  5. Click on an X!Tandem parameter set to inspect/edit the parameter values. Click on "Save" to save the new values.

Typical Start of X!Tandem Search Jobs

  1. Select the Project to use.
  2. Select <Active Project> -> Files, then select the input spectrum files by clicking their check boxes.
  3. Click on "Extensions", then select "Use spectrum file(s) for X!Tandem search".
  4. Select an X!Tandem parameter set by clicking its check box.
  5. Click on "Next - Edit parameters before search".
  6. Inspect and optionally edit X!Tandem parameters.
  7. Click "Save and Search X!Tandem" to save parameter values and create search jobs, one for each spectrum file.
  8. Wait for the search jobs to complete, and the result files to be automatically uploaded to Proteios. Take a cup of coffee/lunch break/vacation, depending on how many search jobs you started, and the complexity of the searches.
  9. Inspect the result files, and import selected results.

Comments

  1. If you know that a selected X!Tandem parameter set has correct values, instead of clicking on "Next - Edit parameters before search", you can click on "Next - Create search job[s]" to skip the parameter page.
  2. When editing parameters, note that "Taxon", under sub header "X!Tandem - Protein", must be set to a valid value for a search to be performed.
  3. When editing parameters, no entry is needed for "Spectrum path", under sub header "X!Tandem - Spectrum", as the spectrum file data is taken directly from the selected spectrum file item.
  4. It is possible to change the scoring algorithm from "native" to "k-score". For the search to work, X!Tandem needs to have "k-score" pluggable scoring installed, which is generally not the case for public GPM servers. More information about pluggable X!Tandem scoring can be found via the References section. For local X!Tandem/GPM installations, you can add k-score by recompiling the tandem.exe binary with m_score_k.h and m_score_k.cpp in the tandem source directory.

Traceability

  1. An X!Tandem parameter file item has a reference in its description field to the X!Tandem parameter file its values were copied from.
  2. An X!Tandem parameter set has a direct link to an X!Tandem parameter file item, shown in the table entry for the former when clicking on "X!Tandem parameter sets" in the left side menu.
  3. An X!Tandem search job has references in its description field to the X!Tandem parameter set and spectrum file used.
  4. An X!Tandem search job has references in its status field to the X!Tandem parameter set and X!Tandem result file (the latter is replaced by an error message if the search could not be performed).
  5. An uploaded X!Tandem result file has references in its description field to the X!Tandem parameter set and spectrum file used. Its file type is set to "Tandem result".
  6. For easier identification of an uploaded GPM result file, the filename of the original result file is prefixed with the base name of the spectrum file (filename without file extension) plus an underscore character '_', and file extension ".xml" is exchanged for ".xt.xml".
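
The renaming rule in item 6 can be sketched in Python as follows. This is a minimal illustration of the rule as worded above, not the actual Proteios code; the function name uploaded_result_name and the spectrum file name are hypothetical.

```python
import os


def uploaded_result_name(spectrum_filename, gpm_result_filename):
    """Build the name of the uploaded result file from the rule in item 6.

    The GPM result filename is prefixed with the spectrum file's base name
    (filename without extension) plus '_', and a trailing '.xml' is
    exchanged for '.xt.xml'.
    """
    base, _extension = os.path.splitext(os.path.basename(spectrum_filename))
    name = base + "_" + os.path.basename(gpm_result_filename)
    if name.endswith(".xml"):
        name = name[: -len(".xml")] + ".xt.xml"
    return name


if __name__ == "__main__":
    # Hypothetical spectrum file, result file name as in the example on this page.
    print(uploaded_result_name("sample01.mgf", "GPM00300000092.xml"))
    # prints "sample01_GPM00300000092.xt.xml"
```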

Background Information

Terms used

X!Tandem
A search engine for protein/peptide identification, using mass spectrum files as input. See References for links.
GPM
Global Proteome Machine, a web interface to X!Tandem. See References for links.
Core File
A File item in the Proteios database
Core Directory
A Directory item in the Proteios database
HTTP
Hypertext Transfer Protocol, a standard protocol for communication over intranets and the World Wide Web

Basic Operation

The X!Tandem search parameter sets are stored internally as X!Tandem XML parameter files. When an X!Tandem parameter set is loaded for inspection or editing, it is retrieved from the corresponding XML parameter file. When an edited X!Tandem parameter set is saved, it is exported back to the XML parameter file. When the X!Tandem parameter set values are inspected directly prior to starting a search job, the parameter values are always exported back to the XML parameter file, in case some value has been changed.
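
The load, edit, and save cycle described above can be sketched in Python. The sketch assumes the usual X!Tandem input layout, a bioml root element containing note elements with type="input" and a label attribute; the example values and the helper names load_parameters and save_parameters are illustrative assumptions and do not reflect how Proteios implements this internally.

```python
import xml.etree.ElementTree as ET

# Small illustrative parameter file; the values are examples only.
EXAMPLE_XML = """\
<bioml>
  <note type="input" label="protein, taxon">yeast</note>
  <note type="input" label="spectrum, fragment monoisotopic mass error">0.4</note>
</bioml>
"""


def load_parameters(xml_text):
    """Read all input notes into a label -> value dictionary."""
    root = ET.fromstring(xml_text)
    return {
        note.get("label"): (note.text or "").strip()
        for note in root.iter("note")
        if note.get("type") == "input"
    }


def save_parameters(params):
    """Write a label -> value dictionary back to parameter XML text."""
    root = ET.Element("bioml")
    for label, value in params.items():
        note = ET.SubElement(root, "note", {"type": "input", "label": label})
        note.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    params = load_parameters(EXAMPLE_XML)
    params["protein, taxon"] = "human"  # edit a value, as in the GUI
    print(save_parameters(params))
```

Round-tripping through a dictionary drops comments and non-input notes from the file, which is acceptable for a sketch but would need care in a real exporter.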

When an X!Tandem search is performed from Proteios, HTTP form data for the search parameter values and the full spectrum file are sent to the GPM web site, in a way resembling how HTTP form data is sent when a user clicks the submit button for performing a search from a GPM web page.

The GPM result file is normally named "GPM003XXXXXXXX.xml", where "XXXXXXXX" is a serial number set at run time to separate the result files, e.g. "GPM00300000092.xml" for serial number 92. However, it may also have a time stamp suffix, e.g. "GPM00300000092.2008_05_12_14_49_26.t.xml" for a result file created on May 12, 2008, at 14:49:26 local time. Proteios obtains the base name of the result file (file name without time stamp suffix) from the GPM output, and then checks if a file with this name can be found on the GPM server. If not, it looks for a file with a time stamp suffix, using the current time. Since there is a delay between the creation of the result file on the GPM server and the output info obtained by the Proteios search job, the first time stamp used by Proteios might be inaccurate. For a GPM server on another system, differences in time zones and system clock settings also require correction of the time stamp. If no output file is found, the check is repeated for earlier and later time stamps, up to a number of seconds (normally 90 seconds). If a corresponding result file is found on the GPM server, it is uploaded to Proteios, otherwise a note is made in the status field for the job item that the user has to upload the file "GPM00300000092*.xml" manually.
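
The file name search described above can be sketched in Python as a generator of candidate names: first the plain base name, then time-stamped names that move progressively further from the corrected current time. The function name, its parameters, and the sign convention for the time corrections (added to the local time) are assumptions made for illustration; only the file name pattern and the 90 second window come from the text.

```python
from datetime import datetime, timedelta


def candidate_result_names(base_name, now, timediff_hours=0,
                           correction_minutes=0, max_offset_seconds=90):
    """Yield result file names to look for on the GPM server, in order."""
    # The name without a time stamp is checked first.
    yield base_name
    stem = base_name[: -len(".xml")] if base_name.endswith(".xml") else base_name
    # Assumed convention: corrections are added to the local time.
    server_now = now + timedelta(hours=timediff_hours,
                                 minutes=correction_minutes)
    # Try the corrected time, then alternate earlier and later time stamps.
    offsets = [0]
    for seconds in range(1, max_offset_seconds + 1):
        offsets.extend((-seconds, seconds))
    for offset in offsets:
        stamp = (server_now + timedelta(seconds=offset)).strftime(
            "%Y_%m_%d_%H_%M_%S")
        yield f"{stem}.{stamp}.t.xml"


if __name__ == "__main__":
    local_time = datetime(2008, 5, 12, 14, 49, 30)
    names = candidate_result_names("GPM00300000092.xml", local_time)
    for _ in range(4):
        print(next(names))
```

With the local time four seconds after the file was created, the name "GPM00300000092.2008_05_12_14_49_26.t.xml" from the example above appears among the candidates at offset -4.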

References

  1. The binary X!Tandem distribution is described in http://thegpm.org/TANDEM/instructions.html.
  2. The installation FAQ page for the binary X!Tandem distribution can be found at http://thegpm.org/TANDEM/tandem_install_faq.html.
  3. The GPM web interface to X!Tandem is described in http://thegpm.org/GPM/instructions.html.
  4. The installation FAQ page for the GPM web interface to X!Tandem can be found at http://thegpm.org/GPM/gpm_install_faq.html.
  5. Pluggable scoring for X!Tandem is described in this paper, and k-score can be downloaded here.
Last modified on Mar 9, 2011, 11:50:57 AM