  1. 01 Apr, 2021 2 commits
  2. 02 Sep, 2020 1 commit
  3. 12 Jul, 2019 1 commit
  4. 09 Oct, 2018 2 commits
  5. 08 Feb, 2018 1 commit
    • Add further sanity checks to flow module · 425b62de
      Christian Busse authored
      The INDEX SORTING LOCATIONS data can be corrupted in ways that are
      not caught by the current routines and will lead to errors in the
      final "cbind" operation. Add further checks to give meaningful
      error messages.
  6. 29 Sep, 2017 1 commit
    • Add sanity check for filenames and metadata · f4cf8e21
      Christian Busse authored
      Currently there are no checks that the runname provided on the
      commandline matches the one in the ".info" metadata files. This
      produces crashes during the QC that are hard to understand/debug
      if this problem is unknown upfront. Extend sanity checks in
      "pre-flight.sh" to cover this and include them into the normal
      run process by default.
  7. 19 Sep, 2017 1 commit
    • Fix QC for single-locus data sets · 74a537e5
      Christian Busse authored
      Currently QC step 2 crashes when processing data which only contains
      reads from a single valid locus. This is due to R's default behavior
      to simplify single matrix columns to vectors, which also discards
      the colnames. Suppress conversion, maintain single-column matrix.
  8. 07 Aug, 2017 4 commits
    • Fix database INSERTs with long data · 6ac6c8d8
      Christian Busse authored
      Currently, the insert procedures into the 'reads' and the
      'log_table' table rely on the silent truncation of their 'name'
      and 'command' columns by the database, respectively. This creates
      two problems:
      
      1) Starting in version 5.7.5 MySQL has set the mode
      "STRICT_TRANS_TABLES" as default. Therefore INSERTs with data longer
      than the field length will not be truncated anymore but fail and
      produce an error.
      
      2) Read IDs typically contain a common prefix that is shared by all
      reads of a sequencing run. For long read IDs the truncation after
      the first 45 chars can remove the actually distinct/unique part of
      the read ID, thereby creating collisions as the read ID must be
      unique. In this case not all reads will be inserted into the
      database.
      
      Truncate string for `log_table`.`command` to the first 100 chars.
      Truncate string for `reads`.`name` to the last 45 chars.
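      The two truncation rules described in this commit can be sketched as
      follows. This is a minimal Python illustration, not the pipeline's
      actual code: the function names and the example read IDs are
      hypothetical, and only the limits (100 and 45 characters) are taken
      from the commit message.

```python
# Hypothetical helpers; only the limits (100 and 45) come from the commit message.
COMMAND_MAX_LEN = 100
READ_NAME_MAX_LEN = 45


def truncate_command(command: str) -> str:
    """Keep the first 100 characters of a logged command."""
    return command[:COMMAND_MAX_LEN]


def truncate_read_name(name: str) -> str:
    """Keep the last 45 characters, where read IDs of one run usually differ."""
    return name[-READ_NAME_MAX_LEN:]


# Two made-up read IDs that share a long run-specific prefix.
prefix = "RUN" + "X" * 50 + ":"
read_a = prefix + "0001"
read_b = prefix + "0002"

# Cutting after the first 45 characters makes both IDs identical ...
collides = read_a[:READ_NAME_MAX_LEN] == read_b[:READ_NAME_MAX_LEN]
# ... while keeping the last 45 characters preserves the distinct suffix.
distinct = truncate_read_name(read_a) != truncate_read_name(read_b)
```

      Keeping the tail rather than the head only helps if the distinguishing
      part of the ID sits at the end, which is the situation the commit
      message describes.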
    • Update CDR3 motifs in config file · 53c1a866
      Christian Busse authored
    • Fix QC handling of invalid locus calls · 4450917d
      Christian Busse authored
      Currently the tag aggregation routine of the QC module assumes that
      only valid (i.e. expected) loci are found in the database and both
      the proximal and the distal set of tags contain identical loci.
      However, this might not necessarily be the case due to low-
      confidence BLAST hits or cross-contaminations.
      Fix handling of locus information in QC. Fix log levels in QC.
    • 30264619
  9. 31 Jul, 2017 1 commit
  10. 26 Jul, 2017 2 commits
  11. 17 Jul, 2017 1 commit
    • Fix handling of separator characters by Python modules · 45e3a013
      Christian Busse authored
      Currently the Python modules do not tolerate more than one equal
      sign ("=") in each line of the config file. However REs with
      look-ahead functionality require this character. Introduce new
      RE-based line-splitting mechanism. Fix handling of in-line
      comments.
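      A RE-based split of this kind might look like the sketch below. It is
      an assumption about the approach, not the code from this commit: the
      pattern, the function name and the example key are hypothetical. The
      idea is to split only at the first equal sign, so that later ones
      (as in a look-ahead `(?=...)`) stay part of the value. In-line comment
      handling is left out here.

```python
import re

# Hypothetical pattern: a key, the FIRST "=", then everything else as the value.
# The key may not contain "=", whitespace or "#", so comment lines do not match.
CONFIG_LINE = re.compile(r"^\s*([^=\s#]+)\s*=\s*(.*?)\s*$")


def split_config_line(line: str):
    """Return (key, value), or None if the line holds no key=value pair."""
    match = CONFIG_LINE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Made-up config line whose value contains a second "=" inside a look-ahead.
pair = split_config_line("j_motif = FG(?=.G)\n")
comment = split_config_line("# just a comment = ignored")
blank = split_config_line("\n")
```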
  12. 18 Jan, 2017 2 commits
    • Fix MySQL incompatibility for grouping of consensus reads · 32430251
      Christian Busse authored
      Starting in version 5.7.5 MySQL has set the mode
      "ONLY_FULL_GROUP_BY" as default. Therefore SELECTs using
      GROUP BY must not contain any non-aggregated columns.
      
      Fix the affected SQL request.
    • Fix and accelerate QC tag statistics function · 3b675620
      Christian Busse authored
      Currently the func.tag.stats function in the R QC library
      performs two DB requests per read. It has turned out that
      this can cause unstable behaviour as read numbers increase.
      
      Replace the current loop over reads with a new request
      that performs all calculations at once in the DB. This
      reduces the number of requests to one per locus and
      simultaneously speeds up the function by >10x.
  13. 09 Jan, 2017 1 commit
  14. 06 Jan, 2017 1 commit
    • Fix bug in metainfo import · 4cdf8cdd
      Christian Busse authored
      Currently the todb_sampleinfo_highth.pl script assumes that each
      identifier used in *_metainfo.tsv is also present in *_plate.tsv
      and crashes if this assumption is not met. Add condition to
      catch such an event and present a warning to the user.
      
      Fix minor mistakes in debugging output and comments.
  15. 05 Jan, 2017 1 commit
    • Make receptor type selectable via config file · 74dd84cf
      Christian Busse authored
      config: Add new key 'receptor_type', move 'species' key further
      to the top.
      
      Makefile: Evaluate 'config' file and use 'receptor_type' key to
      switch processing between Ig and TCR. Add sanity checks for
      command line parameters. Detect config file in current and
      parent directory, prioritize current over parent.
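      The lookup order for the config file (current directory first, then
      the parent) can be illustrated with a short Python sketch. The commit
      implements this in the Makefile; the function below is a hypothetical
      stand-in, and the `receptor_type` values written to the files are
      assumed for the demonstration only.

```python
import tempfile
from pathlib import Path


def find_config(start: Path, filename: str = "config"):
    """Return the config in `start` if present, else the one in its parent."""
    for directory in (start, start.parent):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


# Demonstration in a throwaway directory tree: <tmp>/config and <tmp>/run/config.
with tempfile.TemporaryDirectory() as tmp:
    parent = Path(tmp)
    current = parent / "run"
    current.mkdir()

    # Only the parent has a config file: it is picked up as a fallback.
    (parent / "config").write_text("receptor_type=Ig\n")
    parent_used = find_config(current) == parent / "config"

    # Both directories have one: the current directory takes priority.
    (current / "config").write_text("receptor_type=TCR\n")
    current_wins = find_config(current) == current / "config"
```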
  16. 31 Dec, 2016 2 commits
    • Merge branch 'tcr_process' · 44662597
      Christian Busse authored
      config: Combine lists of matrix batches.
    • Accelerated and TCR-enabled read mapping · eed98463
      Christian Busse authored
      Mapping of dual-tagged reads to physical positions is often the
      most time consuming step of data processing and does not
      parallelize well due to locks on database tables. In addition,
      in the current version an individual instance of the script does
      not restrict itself to the locus that was initially passed to it
      via the command line. This can create non-deterministic behavior,
      when processing of multiple loci is parallelized. Finally, this
      script also requires modification for TCR processing. Changes:
      
      Collect all seq_ids for a given well_id or consensus_id, then
      perform the SQL UPDATE in a single statement (currently each
      line is updated individually).
      
      Fix the SQL statement selecting unprocessed well_id and include
      the assigned locus as criterion.
      
      Add new locus suffixes for the well_id string to facilitate
      processing of TCR sequences.
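      The first change, one UPDATE per well instead of one per row, can be
      sketched as below. This is only an illustration under assumptions:
      it uses Python with an in-memory SQLite database rather than the
      pipeline's own database and script, and the table name `sequences`,
      its layout and the well label are hypothetical.

```python
import sqlite3

# Hypothetical table; only the column names seq_id and well_id appear in the commit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (seq_id INTEGER PRIMARY KEY, well_id TEXT)")
conn.executemany(
    "INSERT INTO sequences (seq_id) VALUES (?)", [(i,) for i in range(1, 6)]
)


def assign_well(connection, well_id, seq_ids):
    """Set well_id for all collected seq_ids with a single UPDATE statement."""
    if not seq_ids:
        return 0  # nothing collected, so no statement is sent
    placeholders = ", ".join("?" for _ in seq_ids)
    sql = f"UPDATE sequences SET well_id = ? WHERE seq_id IN ({placeholders})"
    cursor = connection.execute(sql, [well_id, *seq_ids])
    return cursor.rowcount  # number of rows changed by this one statement


# Collect the IDs first, then update them together (made-up well label).
updated = assign_well(conn, "well_A", [1, 2, 4])
assigned = conn.execute(
    "SELECT seq_id FROM sequences WHERE well_id = 'well_A' ORDER BY seq_id"
).fetchall()
```

      Sending one statement per well shortens the time each process holds
      locks on the table, which is the bottleneck the commit message names.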
  17. 30 Dec, 2016 3 commits
    • Fix handling of config file in-line comments · c395e12f
      Christian Busse authored
      Currently processing of config file in-line comments is based on
      a space between the value and any further text. However, the
      hash/pound sign should always indicate a comment unless escaped.
      
      Add a RegExp for correct processing of config file lines.
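      The rule stated above (a hash sign always starts a comment unless it
      is escaped) could be expressed with a negative look-behind, as in this
      Python sketch. The commit does not show its RegExp, so the pattern,
      the function name and the backslash as escape character are all
      assumptions.

```python
import re

# Assumed escape syntax: "\#" is a literal hash, any other "#" starts a comment.
UNESCAPED_HASH = re.compile(r"(?<!\\)#.*$")


def strip_inline_comment(line: str) -> str:
    """Remove an in-line comment and unescape literal hash signs."""
    without_comment = UNESCAPED_HASH.sub("", line)
    return without_comment.replace(r"\#", "#").rstrip()


# No space before the hash: it still starts a comment.
no_space = strip_inline_comment("species=mouse# comment without space\n")
# An escaped hash stays in the value, the later unescaped one starts the comment.
escaped = strip_inline_comment(r"tag=A\#B   # trailing comment")
```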
    • Support for human TCR sequences · df525df1
      Christian Busse authored
      Human TCR sequences can require more complex motifs to identify
      the CDR3 and the J segment, including RegExp look-aheads. This
      requires proper quoting and thus makes the handling of these
      variables more complex for the user.
      
      To mitigate this problem, introduce basic sanity checks for motif
      RegExp to todb_CDR_FWR.pl. Add human TCR motifs to config, update
      list of matrix versions.
    • Fixes and further changes for TCR processing · 3b5ef5e2
      Christian Busse authored
      Fix IgBLAST parameters in config file (no quotes). Add capability
      to handle TCR data to todb_sampleinfo_highth.pl and QC scripts.
      Add an additional script to create spatials for TCR data.
  18. 10 Aug, 2016 1 commit
  19. 13 Jun, 2016 1 commit
  20. 06 May, 2016 1 commit
    • Fix designation of custom database · a48fc902
      Christian Busse authored
      The custom database is currently referred to as "NCBIm38", which is
      in analogy to the "NCBIm37" assembly it was originally based on.
      However, the current assembly is called "GRCm38", which can lead
      to confusion. Use correct assembly name in all files.
  21. 03 May, 2016 1 commit
    • Fix parsing of IgBLAST region information · f6054e2b
      Christian Busse authored
      Currently the last query block of IgBLAST output is not parsed
      for region location downstream of (and including) FWR3. Add a further
      condition to process last block correctly.
      
      Remove empty R history file.
  22. 25 Apr, 2016 1 commit
  23. 14 Apr, 2016 4 commits
  24. 22 Jan, 2016 4 commits