TODOs Remove stderr from tail since that confuses some users: tail: cannot open .bedGraph file for reading no such file or directory Or remove head/tail entirely for portability. C++/R function that computes total Poisson loss using bedGraph and bed file input -- this may be more numerically stable than the cost that results from PeakSegFPOP. For example > system("head labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph_penalty=0_loss.tsv labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph_penalty=5.83435542852498e-10_loss.tsv labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph_penalty=0.000172220048971656_loss.tsv") ==> labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph_penalty=0_loss.tsv <== 0 135835 67917 3661581 -6.0018530211077560921 -21976270.986880756915 infeasible 2.8921437994722953846 6 ==> labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph_penalty=5.83435542852498e-10_loss.tsv <== 5.8343554285249800245e-10 135831 67915 3661581 -6.0018530211010858721 -21976270.986895956099 infeasible 2.8940798153034301698 6 ==> labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph_penalty=0.000172220048971656_loss.tsv <== 0.00017222004897165599582 123445 61722 3661581 -6.0018501174476162063 -21976270.98465982452 infeasible 2.9924208443271766988 7 > there seems to be a difference between PeakSegFPOP and manual computation of the cost, at either the fifth or sixth decimal place: penalty 0 5.8343554285249800245e-10 0.000172220048971656 PeakSegFPOP -21976270.9868 80756915 -21976270.9868 95956099 -21976270.98465 982452 manual -21976270.9868 76085401 -21976270.9868 76085401 -21976270.98465 505615 Note above there is no difference in the manual cost between 0 and 5.8e-10, but there is a difference between the PeakSegFPOP cost, at the fifth decimal place. This is a numerical error: the model with 0 penalty should always have at least as low of cost as any other models. But in the PeakSegFPOP numbers, the model with penalty=5.8e-10 has a slightly lower cost. Test via R/data.table code: cov.dt <- fread("labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph") setnames(cov.dt, c("chrom", "chromStart", "chromEnd", "count")) cov.dt[, chromStart1 := chromStart +1L] setkey(cov.dt, chromStart1, chromEnd) segs.dt <- fread("labels/H3K4me3_XJ_immune/samples/tcell/McGill0107/problems/chr22:16847850-20509431/coverage.bedGraph_penalty=5.83435542852498e-10_segments.bed") setnames(segs.dt, c("chrom", "segStart", "segEnd", "status", "mean")) segs.dt[, segStart1 := segStart + 1L] setkey(segs.dt, segStart1, segEnd) over.dt <- foverlaps(cov.dt, segs.dt, nomatch=0L) sprintf("%.12f", over.dt[, PeakSegDP::PoissonLoss(count, mean, chromEnd-chromStart)]) 2022.2.1 2020.8.13 DESCRIPTION: https for URL/BugReports, Suggests: markdown. 2019.12.9 Various fixes to support paths with spaces and parens. fread(file=..) to avoid interpreting a file name with spaces as a command. fread.last calls normalizePath before shQuote, so that tilde is expanded to the home dir. fread.last works with paths containing spaces and parens. 2019.12.2 test/bugfix for PeakSegFPOP_vec, pen.num passed along to PeakSegFPOP_df instead of always using 10.5 for penalty. 2019.9.27 db file may be specified. 2019.9.10 Change-Point pkg title for CRAN. inlinedocs update \\t in Rd. 2019.8.9 PR#7 bugfix for fread.last on platforms which return initial whitespace. 2019.7.29 PR#6 Valgrind write(buf) points to uninitialised byte(s) fixed by not serializing the entire PoissonLossPieceLog struct (of which there were 4 un-initialized bytes). Now we instead serialize only the members which are necessary for decoding: max_log_mean, data_i, prev_log_mean. ChIPreads data set. numeric penalty value allowed for PeakSegFPOP_dir problem.PeakSegFPOP -> PeakSegFPOP_dir problem.sequentialSearch -> sequentialSearch_dir future.apply::future_lapply instead of parallel::mclapply. Vignette: examples. Vignette showing worst case asymptotic complexity. Vignette showing spatial correlation. 2019.6.5 do not write first up cost, which is undefined. 2019.5.24 PR#5 check Inf string for windows. 2018.12.11 PR#4 if data all the same then do not run DP. 2018.11.28 PR#2 Remove dependency on Berkeley DB. 2018.11.20 PR#3 problem.PeakSegFPOP deletes lock file before running algo. PeakSegFPOP_disk errors (instead of crashing R) if there is a lock file. 2018.11.14 export fread.first/last check for empty loss/timing files 2018.10.31 Use wc/nrows/skip in fread.first/last rather than fread("head/tail -1 file") 2018.10.30 PKG_LIBS in Makevars: remove -ldb_cxx -ldb, add -pthread 2018.10.26 add -ldb_cxx -ldb to Makevars for solaris. in SystemRequirements, link to source, name debian packages. cast log(int) to double, const DbException& e to avoid solaris warn/err. 2018.10.25 C++ code writes number of bedGraph lines to loss file. problem.PeakSegFPOP outputs two data.tables rather than three (timings and loss are combined). 2018.09.28 Forked from PeakSegPipeline. problem.PeakSegFPOP: before was checking start/end pos in segments file with start/end pos in directory name. Now checking first/last line in coverage.bedGraph. No longer import namedCapture.