id
processing priority
4
site type
3 (personal blog or private political site, e.g. Blogspot, Substack, also small blogs on own domains)
review version
11
html import
20 (imported)
first seen date
2024-08-25 20:01:00
expired found date
-
created at
2024-08-25 20:01:00
updated at
2025-05-06 22:00:18
length
19
crc
32072
tld
2211
nm parts
0
nm random digits
0
nm rare letters
0
is subdomain of id
13642151 (wordpress.com)
previous id
0
replaced with id
0
related id
-
dns primary id
0
dns alternative id
0
lifecycle status
0 (unclassified, or currently active)
deleted subdomains
0
page imported products
0
page imported random
0
page imported parking
0
count skipped due to recent timeouts on the same server IP
0
count content received but rejected due to 11-799
0
count dns errors
0
count cert errors
0
count timeouts
0
count http 429
0
count http 404
0
count http 403
0
count http 5xx
0
next operation date
-
server bits
—
server ip
-
mp import status
20
mp rejected date
-
mp saved date
-
mp size orig
255622
mp size raw text
123914
mp inner links count
15
mp inner links status
10 (links queued, awaiting import)
title
Blue Collar Bioinformatics
description
Note: new posts have moved to http://bcb.io/ Please look there for the latest updates and comments
image
site name
Blue Collar Bioinformatics
author
updated
2026-02-26 06:08:26
raw text
Blue Collar Bioinformatics | Note: new posts have moved to http://bcb.io/ Please look there for the latest updates and comments Validating multiple cancer variant callers and prioritization in tumor-only samples Overview The post discusses work validating multiple cancer variant callers in bcbio-nextgen using a synthetic reference call set from the ICGC-TCGA DREAM challenge. We’ve previously validated germline variant calling methods, but cancer calling is additionally challenging. Tumor samples have mixed cellularity due to contaminating normal sample, and consist of multiple sub-clones with different somatic variations. Low-frequency sub-clonal variations can be critical to understand disease progression but are more difficult to detect with high sensitivity and precision. Publicly available whole genome truth sets like ...
redirect type
0 (-)
block type
0 (no issues)
detected language
1 (English)
category id
index version
1
spam phrases
0
text nonlatin
0
text cyrillic
0
text characters
65535
text words
18063
text unique words
2865
text lines
1706
text sentences
798
text paragraphs
255
text words per sentence
22
text matched phrases
0
text matched dictionaries
0
links self subdomains
0
links other subdomains
90 - info.bina.com, bcbio-nextgen.readthedocs.org, hsphbio.ghost.io, gla.ac.uk, bioinformatics.oxfordjournals.org, cancer.sanger.ac.uk, exac.broadinstitute.org, ebi.ac.uk, useast.ensembl.org, cnvkit.readthedocs.org, boto.readthedocs.org, s3.amazonaws.com, d0.awsstatic.com, gatkforums.broadinstitute.org, well.ox.ac.uk, cdn.vanillaforums.com, ccr.coriell.org, blog.goldenhelix.com, nar.oxfordjournals.org, ftp.1000genomes.ebi.ac.uk, bedtools.readthedocs.org, lobstr.teamerlich.org, envgen.nox.ac.uk, wiki.debian.org, conda.binstar.org, repo.continuum.io, gatk.vanillaforums.com, biologie.hu-berlin.de, basespace.illumina.com, lists.open-bio.org, mobyle.pasteur.fr, blogs.nopcode.org, ivory.idyll.org, dm.genomespace.org, cran.r-project.org, pandas.pydata.org
links other domains
121 - synapse.org, bcb.io, genomeinabottle.org, icgc.org, biorxiv.org, bina.com, astrazeneca.com, broadinstitute.org, genomebiology.com, quinlanlab.org, mobio.com, clinvar.com, 1000genomes.org, horizondx.com, ansible.com, biogenidec.com, massgeneral.org, infoq.com, galaxyproject.org, arvados.org, sbgenomics.com, taverna.org.uk, open-bio.org, bioconductor.org, coreos.com, clusterk.com, htslib.org, illumina.com, melissagymrek.com, genomicsandhealth.org, 01.org, biomedcentral.com, cloudbiolinux.org, minke-informatics.co.uk, ipython.org, curoverse.com, edgebio.com, biostars.org, j.mp, biojava.org, pymol.org, edamontology.org, bioplanet.com, ruffus.org.uk, dell.com, eucalyptus.com, openstack.org, novocraft.com, gkno.me, horde.net, moosefs.org, genomemedicine.com, repeatmasker.org, f1000research.com
links spam adult
0
links spam random
0
links spam expired
0
links ext activities
31
links ext ecommerce
links ext finance
0
links ext crypto
0
links ext booking
0
links ext news
0
links ext leaks
0
links ext ugc
224 - s0.wp.com, wp.me, s1.wp.com, twitter.com, i.imgur.com, imgur.com, en.wikipedia.org, jermdemo.blogspot.com, docs.google.com, joachimbaran.wordpress.com, youtube.com, mussolblog.wordpress.com, linkedin.com, basecallbio.wordpress.com, feeds2.feedburner.com, wordpress.com
links ext klim
0
links ext generic
6
dol status
0
dol updated
2026-02-26 06:08:26
rss status
32 (unknown)
rss found date
2024-08-29 03:52:04
rss size orig
199596
rss items
10
rss spam phrases
0
rss detected language
1 (English)
inbefore feed id
-
inbefore status
0 (new)
sitemap path
sitemap status
40 (completed successful import of reports.txt file to table in_pages)
sitemap review version
2
sitemap urls count
59
sitemap urls adult
0
sitemap filtered products
0
sitemap filtered videos
0
sitemap found date
2024-08-27 04:52:12
sitemap process date
2024-08-27 04:52:12
sitemap first import date
-
sitemap last import date
2025-05-06 22:00:18