Changes between Version 27 and Version 28 of WikiStart


Timestamp:
08/06/15 17:01:04 (9 years ago)
Author:
admin
Comment:

--

  • WikiStart

    v27 v28  
    66#!html
    77<table style="border-spacing: 1em"><tr>
     8
     9<td class="app" style="background-color:#000080 ; background-image:url('/chrome/site/justext_nb.png')">
     10<p><a href="/wiki/Justext">
     11JusText is an HTML boilerplate removal tool. It can strip navigation links, headers, footers, etc. from HTML pages and leave just regular text containing full sentences.</a><p>
     12<p>
     13<a class="lnk" href="http://is.muni.cz/th/45523/fi_d/phdthesis.pdf">Paper</a>
     14|
     15<a class="lnk" href="/wiki/Justext/Cite">Cite</a>
     16|
     17<a class="lnk" href="http://opensource.org/licenses/BSD-3-Clause">Licence</a>
     18</p>
     19</td>
     20
     21<td class="app" style="background-color:#800000 ; background-image:url('/chrome/site/_nb.png')">
     22<p><a href="/wiki/Chared">
     23Chared is a tool for detecting the character encoding of a text in a known language. It contains models for a wide range of languages.</a><p>
     24<p>
     25<a class="lnk" href="#">Paper</a>
     26|
     27<a class="lnk" href="/wiki/Chared/Cite">Cite</a>
     28|
     29<a class="lnk" href="http://opensource.org/licenses/BSD-3-Clause">Licence</a>
     30</p>
     31</td>
     32
     33</tr><tr>
     34
     35<td class="app" style="background-color:#800080 ; background-image:url('/chrome/site/_nb.png')">
     36<p><a href="/wiki/SpiderLing">Spiderling is a web spider for linguistics. It can crawl text-rich parts of the web and collect a lot of data suitable for text corpora.
     37</a><p>
     38<p>
     39<a class="lnk" href="http://nlp.fi.muni.cz/~xsuchom2/papers/PomikalekSuchomel_SpiderlingEfficiency.pdf">Paper</a>
     40|
     41<a class="lnk" href="/wiki/SpiderLing/Cite">Cite</a>
     42|
     43<a class="lnk" href="http://www.gnu.org/licenses/gpl.txt">Licence</a>
     44</p>
     45</td>
    846
    947<td class="app" style="background-color:#008000 ; background-image:url('/chrome/site/onion_nb.png')">
     
    1957</td>
    2058
    21 <td class="app" style="background-color:#800000 ; background-image:url('/chrome/site/unitok_nb.png')">
     59</tr><tr>
     60
     61<td class="app" style="background-color:#808000 ; background-image:url('/chrome/site/unitok_nb.png')">
    2262<p><a href="/wiki/Unitok">
    2363Unitok is a universal text tokeniser with specific settings for many languages. It can turn plain text into a sequence of newline-separated tokens (“vertical” format), while preserving XML-like tags containing metadata.</a></p>
     
    3171</td>
    3272
    33 </tr><tr>
    34 
    35 <td class="app" style="background-color:#000080 ; background-image:url('/chrome/site/justext_nb.png')">
    36 <p><a href="/wiki/Justext">
    37 JusText is an HTML boilerplate removal tool. It can strip navigation links, headers, footers, etc. from HTML pages and leave just regular text containing full sentences.</a><p>
    38 <p>
    39 <a class="lnk" href="http://is.muni.cz/th/45523/fi_d/phdthesis.pdf">Paper</a>
    40 |
    41 <a class="lnk" href="/wiki/Justext/Cite">Cite</a>
    42 |
    43 <a class="lnk" href="http://opensource.org/licenses/BSD-3-Clause">Licence</a>
    44 </p>
    45 </td>
    46 
    47 <td class="app" style="background-color:#800080 ; background-image:url('/chrome/site/_nb.png')">
    48 <p><a href="/wiki/SpiderLing">Spiderling is a web spider for linguistics. It can crawl text-rich parts of the web and collect a lot of data suitable for text corpora.
    49 </a><p>
    50 <p>
    51 <a class="lnk" href="http://nlp.fi.muni.cz/~xsuchom2/papers/PomikalekSuchomel_SpiderlingEfficiency.pdf">Paper</a>
    52 |
    53 <a class="lnk" href="/wiki/SpiderLing/Cite">Cite</a>
    54 |
    55 <a class="lnk" href="http://www.gnu.org/licenses/gpl.txt">Licence</a>
    56 </p>
    57 </td>
    58 
    59 </tr><tr>
    60 
    61 <td class="app" style="background-color:#808000 ; background-image:url('/chrome/site/_nb.png')">
    62 <p><a href="/wiki/Chared">
    63 Chared is a tool for detecting the character encoding of a text in a known language. It contains models for a wide range of languages.</a><p>
    64 <p>
    65 <a class="lnk" href="#">Paper</a>
    66 |
    67 <a class="lnk" href="/wiki/Chared/Cite">Cite</a>
    68 |
    69 <a class="lnk" href="http://opensource.org/licenses/BSD-3-Clause">Licence</a>
    70 </p>
    71 </td>
    72 
    73 <td class="app" style="background-color:#000000 ; background-image:url('/chrome/site/_nb.png')">
     73<td class="app" style="background-color:#008080 ; background-image:url('/chrome/site/_nb.png')">
    7474<p><a href="http://nlp.fi.muni.cz/trac/noske">NoSketch Engine is the open-sourced little brother of the corpus querying system Sketch Engine.
    7575</a><p>