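Supported communication method over HTTPS:

  • POST: only the POST method is supported at this time.

Request parameters:

  • username*: the account (email) you registered on this site.
  • api_key*: a 31-character string issued once you have purchased and paid for segmentation quota on this site.
  • input_str*: the text to send to Articut for word segmentation and part-of-speech tagging. Note: each request may contain at most 2000 characters.
  • version: optional. If left blank or set to "latest", Articut segments your text with the newest version of the algorithm. You may also name a specific version; for example, the string "v001" selects version v001 of the segmentation algorithm.
  • level: either "lv1" or "lv2". With lv1, the result is derived directly from the syntactic structure of the sentence, i.e. segmentation with grammatical ability but no encyclopedic knowledge. With lv2, the encyclopedic knowledge base of Droidtown (卓騰) is additionally used to assist the computation.
  • user_defined_dict_file: a user-defined dictionary; it must be in dictionary format (e.g. UserDefinedDICT = {"key": ["value1", "value2",...],...}).
  • opendata_place: Boolean, default False. The government OpenData platform hosts the "spatial tourism information published by government agencies and collected by the Tourism Bureau, MOTC". Articut can draw on this data and tags matches as <KNOWLEDGE_place>.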
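
Because input_str is capped at 2000 characters per request, longer documents have to be split on the client side. The following is a minimal sketch of one way to do that; the function name `split_for_articut` is hypothetical, and it assumes the limit is counted in Python `str` characters, which the documentation does not state explicitly.

```python
import re

MAX_INPUT_CHARS = 2000  # per-request limit quoted in the parameter list


def split_for_articut(text, limit=MAX_INPUT_CHARS):
    """Split text into chunks of at most `limit` characters.

    Cuts fall after sentence-ending punctuation where possible; a single
    sentence longer than `limit` is cut at the limit.
    """
    # Each match is one sentence together with its closing punctuation.
    sentences = re.findall(r"[^。!?!?\n]*[。!?!?\n]?", text)
    chunks, current = [], ""
    for sentence in sentences:
        while len(sentence) > limit:      # oversized sentence: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if len(current) + len(sentence) > limit:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own input_str; joining the chunks in order reproduces the original text.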

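Response fields:

  • status: Boolean. Either True or False.
  • msg: string. One of the following messages:
    • Success!: segmentation completed successfully.
    • Specified version does not exist.: the algorithm version you specified cannot be found. Please check the version value again.
    • Specified level does not exist.: the knowledge level you specified cannot be found. The level must be "lv1" or "lv2". (Only "lv1" is supported at present.)
    • Authtication failed.: your account could not be verified. Please check that the account you used is correct.
    • API_Key failed.: invalid api_key. Please check that your api_key is correct.
    • Your input_str is too long. (over 2000 characters.): input_str exceeds the 2000-character limit.
    • Insufficient word count balance.: the word count balance under your account is not enough for this segmentation request.
    • Internal server error. (Your word count balance is not consumed, don't worry. System will reboot in 5min, please try again later.): our server seems to have run into a problem. We are working on a fix and the system restarts automatically within 5 minutes, so please try again later. When a segmentation result cannot be returned, your balance is not charged.
    • Invalid content_type.: the upload must be in JSON format (application/json).
    • Invalid arguments.: wrong upload parameters. Please check that the parameter names follow the specification.
    • UserDefinedDICT Parsing ERROR. (Please check your the format and encoding.): the user-defined dictionary could not be loaded. Please check that the format (dict) and encoding (UTF-8) are correct.
    • Maximum UserDefinedDICT file size exceeded! (UserDefinedDICT file shall be samller than 10MB.): the user-defined dictionary file is larger than 10MB.
  • result_pos: list. Holds the segmentation result of each sentence separately, with part-of-speech (POS) tags wrapped around each word.
  • result_segmentation: string. The complete input after segmentation, returned as one string with a slash ( / ) marking each word boundary.
  • exec_time: float. Server time spent on this segmentation job.
  • version: string. The algorithm version used for this segmentation job.
  • level: string. The knowledge level used for this segmentation job.
  • word_count_balance: integer. The remaining word count available under your account.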
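  • <IDIOM>: idioms.
  • <FUNC_determiner>: determiners (called 定語 in Chinese-linguistics usage).
  • <TIME_holiday>: time expressions related to holidays.
  • <TIME_justtime>: time expressions related to the present moment or an instant.
  • <TIME_day>: time expressions measured in days.
  • <TIME_week>: time expressions measured in weeks.
  • <TIME_month>: time expressions measured in months.
  • <TIME_season>: time expressions measured in seasons.
  • <TIME_year>: time expressions measured in years.
  • <TIME_decade>: time expressions measured in units longer than a year.

    You can simply treat all of the TIME tags above as temporal adverbs.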

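  • <ENTITY_num>: plain numbers.

  • <ENTITY_classifier>: classifiers (量詞, also called 分類詞 in Chinese-linguistics usage).
  • <ENTITY_measurement>: measurements (a measured value, e.g. 「一公斤」, 「30公分」).
  • <ENTITY_person>: a noun that the system presumes refers to a human being (mainly common Han three-character names and single-character given names).
  • <ENTITY_pronoun>: pronouns. (If needed, these can be subdivided into specific-reference pronouns (e.g., 爸爸) and generic-reference pronouns (e.g., 老公公).)
  • <ENTITY_possessive>: possessive nouns.
  • <ENTITY_noun>: nouns the system already knows.
  • <ENTITY_nounHead>: the head of a noun phrase.
  • <ENTITY_nouny>: words the system presumes to be nouns.
  • <ENTITY_oov>: words the system does not recognize but treats as nouns.

    You can simply treat all of the ENTITY tags above as nouns. A few composition rules apply:

    • Consecutive nouny tokens can be treated as one larger noun phrase (e.g., 咖啡(ENTITY_nouny) 杯(ENTITY_nouny) is treated directly as 「咖啡杯」(ENTITY_NP));
    • A nounHead stacks with what precedes it to form a noun phrase (e.g., 小(MODIFIER) 紅(MODIFIER_color) 帽(ENTITY_nounHead) stacks into 「小紅帽」(ENTITY_NP); 遊樂(ACTION_verb) 場(ENTITY_nounHead) stacks into 「遊樂場」(ENTITY_NP));
    • In contrast, a nouny does not stack with a verb (ACTION_verb) into a noun phrase; it only serves as the object of the verb, forming a verb phrase (e.g., 認識(ACTION_verb) 字(ENTITY_nouny) becomes 「認識字」(ACTION_VP)).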
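  • <MODIFIER>: adjectives and adverbs.

  • <MODIFIER_color>: color adjectives.
  • <FUNC_degreeHead>: the degree head of an adjective phrase (很, 極, 非常, etc.). It marks the point where adjectives stop stacking; the adjective next to it combines with this degree head into a modifying expression that describes degree.

  • <ASPECT>: aspect markers (了, 著, etc.).

  • <QUANTIFIER>: quantifier markers (都, 全, etc.).
  • <LOCATION>: place names.
  • <RANGE_locality>: range markers for a location.
  • <RANGE_period>: range markers for a time period.
  • <FUNC_modifierHead>: the head of an adjective or adverb phrase.
  • <MODAL>: modal markers (e.g., 可以, 能, 會).
  • <AUX>: auxiliary verbs (e.g., 是, 為).

  • <CLAUSE_AnotAQ>: A-not-A questions.

  • <CLAUSE_YesNoQ>: yes/no questions.
  • <CLAUSE_WhoQ>: "who" questions.
  • <CLAUSE_WhatQ>: "what" questions.
  • <CLAUSE_WhereQ>: "where" questions.
  • <CLAUSE_WhenQ>: "when" questions.
  • <CLAUSE_HowQ>: "degree/process" (how) questions.
  • <CLAUSE_WhyQ>: "reason" (why) questions.
  • <CLAUSE_Particle>: small sentence elements with no particular meaning (e.g., 啊, 啦, 喔).

  • <FUNC_negation>: negation function words.

  • <FUNC_conjunction>: conjunction function words.
  • <FUNC_inter>: outward function words (the full meaning has to be completed outside the current sentence, e.g., 然而).
  • <FUNC_inner>: inward function words (the full meaning can be completed within the current sentence, e.g., 在).
  • <ACTION_lightVerb>: light verbs (e.g., 被, 把, 弄).
  • <ACTION_verb>: verbs.
  • <ACTION_quantifiedVerb>: quantified verbs (the action is carried out only to a limited extent, e.g., 看一看, 瞧瞧, 嚐嚐看).
  • <KNOWLEDGE_addTW>: Taiwan addresses.
  • <KNOWLEDGE_url>: URLs.
  • <KNOWLEDGE_place>: tourist attractions listed on the government OpenData platform.
  • <UserDefined>: user-defined words.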
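
The tag list and the nouny composition rule can be exercised on the client with a few lines of code. The sketch below is only an illustration, not Articut's own implementation: `parse_pos` and `merge_nouny` are hypothetical helper names, and the regular expression assumes every token in a result_pos string has the form <TAG>text</TAG>. Wrapper tags that enclose other tags (such as a CLAUSE tag around several tokens) are skipped, and only the inner tokens are returned.

```python
import re

# One <TAG>text</TAG> pair; the backreference forces the closing tag to match.
TAG_PATTERN = re.compile(r"<([A-Za-z_]+)>([^<]*)</\1>")


def parse_pos(sentence):
    """Return [(tag, text), ...] for one result_pos string."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(sentence)]


def merge_nouny(tokens):
    """Join each run of two or more ENTITY_nouny tokens into one ENTITY_NP."""
    merged, run = [], []
    for tag, text in list(tokens) + [(None, "")]:  # sentinel flushes the last run
        if tag == "ENTITY_nouny":
            run.append(text)
            continue
        if len(run) > 1:
            merged.append(("ENTITY_NP", "".join(run)))
        elif run:
            merged.append(("ENTITY_nouny", run[0]))
        run = []
        if tag is not None:
            merged.append((tag, text))
    return merged


tokens = parse_pos("<ENTITY_nouny>咖啡</ENTITY_nouny><ENTITY_nouny>杯</ENTITY_nouny>")
print(merge_nouny(tokens))
```

The nounHead and verb-object rules could be added in the same style by inspecting the tag of the preceding token.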

Articut API URL

POST https://api.droidtown.co/Articut/API/

Request body (JSON format)

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "input_str": "我想過過過兒過過的日子。"
}

Response body (JSON format)

{
    "status": true,
    "msg": "Success!",
    "result_pos": ["<ENTITY_pronoun>我</ENTITY_pronoun><ACTION_verb>想</ACTION_verb><ACTION_verb>過</ACTION_verb><ACTION_verb>過</ACTION_verb><ENTITY_pronoun>過兒</ENTITY_pronoun><ACTION_verb>過</ACTION_verb><ASPECT>過</ASPECT><FUNC_inner>的</FUNC_inner><ENTITY_noun>日子</ENTITY_noun>","。"],
    "result_segmentation": "我/想/過/過/過兒/過/過/的/日子/。/",
    "exec_time": 0.08697724342346191,
    "version": "latest",
    "level": "lv1",
    "word_count_balance": 99988
}
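
A request like the one above can be assembled with the Python standard library alone. This is a hedged sketch rather than an official client: `build_payload`, `build_request`, and `articut` are hypothetical names, the 10-second timeout is an arbitrary choice, and the credentials in the documentation example are placeholders that will not authenticate.

```python
import json
import urllib.request

ARTICUT_URL = "https://api.droidtown.co/Articut/API/"


def build_payload(username, api_key, input_str, version="latest", level="lv1"):
    """Assemble the JSON body, rejecting text over the documented limit."""
    if len(input_str) > 2000:
        raise ValueError("input_str exceeds 2000 characters")
    return {
        "username": username,
        "api_key": api_key,
        "input_str": input_str,
        "version": version,
        "level": level,
    }


def build_request(payload, url=ARTICUT_URL):
    """Wrap the payload in a POST request with the JSON content type."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def articut(payload, timeout=10):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping the payload and request construction separate from the network call makes the first two functions easy to check offline.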

Versions API URL

POST https://api.droidtown.co/Articut/Versions/

Request body (JSON format)

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all"
}

Response body (JSON format)

{
    "status": true,
    "msg": "Success!",
    "versions": [
        {
            "version": "latest",
            "release_date": "2019-02-22",
            "level": ["lv1"]
        },
        {
            "version": "v100",
            "release_date": "2019-02-22",
            "level": ["lv1"]
        },
        {
            "version": "v002",
            "release_date": "2019-02-20",
            "level": ["lv1"]
        }
    ]
}
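
Once decoded, the Versions response can be filtered for the versions that offer a given level. The helper below is a small sketch; `versions_supporting` is a hypothetical name, and it assumes the response has the shape shown in the example above.

```python
def versions_supporting(response, level):
    """List the version names in a Versions response that offer `level`."""
    if not response.get("status"):
        raise RuntimeError(response.get("msg", "unknown error"))
    return [v["version"] for v in response["versions"] if level in v["level"]]


example = {
    "status": True,
    "msg": "Success!",
    "versions": [
        {"version": "latest", "release_date": "2019-02-22", "level": ["lv1"]},
        {"version": "v100", "release_date": "2019-02-22", "level": ["lv1"]},
        {"version": "v002", "release_date": "2019-02-20", "level": ["lv1"]},
    ],
}
print(versions_supporting(example, "lv1"))
```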

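Because Articut handles only linguistic knowledge and not encyclopedic knowledge, we provide a user-defined vocabulary feature; matching words are tagged <UserDefined>.
The vocabulary uses dictionary format and is written by you.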


Articut API URL

POST https://api.droidtown.co/Articut/API/

Request body (JSON format)

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "input_str": "我正在計劃地球人類補完計劃",
    "version": "v132",
    "level": "lv1",
    "user_defined_dict_file": {"地球人類補完計劃": ["人類補完計劃", "補完計劃"]}
}

Response body (JSON format)

{
    "exec_time": 0.013453006744384766,
    "level": "lv1",
    "msg": "Success!",
    "result_pos": ["<ENTITY_pronoun>我</ENTITY_pronoun><ASPECT>正在</ASPECT><ACTION_verb>計劃</ACTION_verb><UserDefined>地球人類補完計劃</UserDefined>"],
    "result_segmentation": "我/正在/計劃/地球人類補完計劃/",
    "status": true,
    "version": "v132",
    "word_count_balance": 99987
}
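
Two of the documented error messages concern the user-defined dictionary: a parsing error for a wrong format or encoding, and a size limit of 10MB. A client can check both before uploading. The sketch below uses a hypothetical `check_user_dict` helper and makes two labeled assumptions: that 10MB means 10 * 1024 * 1024 bytes, and that the size is measured on the UTF-8 encoded JSON form of the dictionary.

```python
import json

MAX_DICT_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB


def check_user_dict(user_dict):
    """Validate a {str: [str, ...]} dictionary and return its UTF-8 size in bytes."""
    if not isinstance(user_dict, dict):
        raise ValueError("user dictionary must be a dict")
    for key, values in user_dict.items():
        if not isinstance(key, str):
            raise ValueError(f"key {key!r} is not a string")
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            raise ValueError(f"value for {key!r} must be a list of strings")
    # Assumption: the limit applies to the UTF-8 encoded JSON text.
    size = len(json.dumps(user_dict, ensure_ascii=False).encode("utf-8"))
    if size >= MAX_DICT_BYTES:
        raise ValueError("user dictionary must be smaller than 10MB")
    return size


print(check_user_dict({"地球人類補完計劃": ["人類補完計劃", "補完計劃"]}))
```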

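The government open data platform hosts the "spatial tourism information published by government agencies and collected by the Tourism Bureau, MOTC". Articut can draw on this data and tags matches as <KNOWLEDGE_place>.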


Articut API URL

POST https://api.droidtown.co/Articut/API/

Request body (JSON format)

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "input_str": "花蓮的原野牧場有一間餐廳",
    "version": "v137",
    "level": "lv1",
    "opendata_place": true
}

Response body (JSON format)

{
    "exec_time": 0.013453006744384766,
    "level": "lv1",
    "msg": "Success!",
    "result_pos": ["<LOCATION>花蓮</LOCATION><FUNC_inner>的</FUNC_inner><KNOWLEDGE_place>原野牧場</KNOWLEDGE_place><ACTION_verb>有</ACTION_verb><ENTITY_classifier>一間</ENTITY_classifier><ENTITY_noun>餐廳</ENTITY_noun>"],
    "result_segmentation": "花蓮/的/原野牧場/有/一間/餐廳/",
    "status": true,
    "version": "v137",
    "word_count_balance": 99988
}

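Articut Addons (advanced features; registration required)

Request parameters

  • result_pos: the tagged segmentation result returned by Articut.
  • index_with_pos: Boolean, default True. Whether the character positions of the extracted strings are counted with the part-of-speech (POS) tags included.
  • func:
    • get_all: return the results of all the functions below.
    • get_person: extract the person names (person) in the segmentation result. The names in each sentence form one list.
    • get_person_and_pronoun: extract the person names (person) and pronouns (pronoun) in the segmentation result. The names and pronouns in each sentence form one list.
    • get_content_word: extract the content words in the segmentation result. The content words in each sentence form one list.
    • get_verb_stem: extract the verbs (verb) in the segmentation result, meaning words tagged ACTION_verb. The verbs in each sentence form one list.
    • get_noun_stem: extract the nouns (noun) in the segmentation result, meaning words tagged ENTITY_noun, ENTITY_nouny, ENTITY_nounHead, or ENTITY_oov. The nouns in each sentence form one list.
    • get_event: extract the events (event) in the segmentation result. An event consists of one verb plus its object, if any. The events in each sentence form one list.
    • get_time: extract the time expressions (time) in the segmentation result. The time expressions in each sentence form one list.
    • get_opendata_place: extract the scenic spots (KNOWLEDGE_place) in the segmentation result, meaning non-administrative place names tagged KNOWLEDGE_place, such as "鹿港老街" or "宜蘭運動公園". The scenic spots in each sentence form one list.
    • get_location_stem: extract the geographic locations (location) in the segmentation result, meaning administrative place names carrying the location tag, such as "台北", "桃園", or "墨西哥". The locations in each sentence form one list.
    • get_location_event: extract "location - event" combinations from the segmentation result, in the pattern "at some time, at some place, something happened", where the time is optional. get_time supplies the time, get_location_stem and get_opendata_place supply the place, and get_event supplies what happened; the three are then combined into the result.
    • get_question: extract the sentences that contain a <CLAUSE_Q> tag, for example the yes/no question "你認識他嗎?".
    • get_addTW: extract the Taiwan address strings tagged <KNOWLEDGE_addTW>, for example "台北市中山區民權東路二段109號".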

Articut API URL

POST https://api.droidtown.co/Articut/Addons/

Request body (JSON format); username and api_key are required

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "result_pos": ["<MODIFIER>剛剛</MODIFIER><ACTION_verb>得知</ACTION_verb><KNOWLEDGE_place>435藝文特區</KNOWLEDGE_place><AUX>是</AUX><ENTITY_classifier>個</ENTITY_classifier><ACTION_verb>遛</ACTION_verb><ENTITY_nouny>小孩</ENTITY_nouny><FUNC_inner>的</FUNC_inner><MODIFIER>好</MODIFIER><ENTITY_noun>地方</ENTITY_noun>",
                   ",",
                   "<ENTITY_pronoun>你</ENTITY_pronoun><CLAUSE_YesNoQ><AUX>是</AUX><FUNC_negation>否</FUNC_negation></CLAUSE_YesNoQ><ACTION_verb>知道</ACTION_verb><TIME_day>傍晚</TIME_day><MODAL>可以</MODAL><ACTION_verb>到</ACTION_verb><KNOWLEDGE_place>觀音亭</KNOWLEDGE_place><ACTION_verb>去</ACTION_verb><ACTION_verb>看</ACTION_verb><ENTITY_nouny>夕陽</ENTITY_nouny><CLAUSE_Particle>喔</CLAUSE_Particle>",
                   "!",
                   "<TIME_day>今日</TIME_day><TIME_day>傍晚</TIME_day><FUNC_inner>在</FUNC_inner><LOCATION>新竹市</LOCATION><LOCATION>北區</LOCATION><ACTION_verb>溜</ACTION_verb><ENTITY_nouny>小狗</ENTITY_nouny>",
                   "。"],
    "func": ["get_all"],
    "index_with_pos": true
}

Response body (JSON format)

{
    "msg": "Success!",
    "status": true,
    "results": {"addtw_list": [[], [], [], [], [], []],
                "content_word_list": [[[10, 12, "剛剛"],
                                       [36, 38, "得知"],
                                       [159, 160, "遛"],
                                       [188, 190, "小孩"],
                                       [241, 242, "好"],
                                       [266, 268, "地方"]],
                                      [],
                                      [[122, 124, "知道"],
                                       [191, 192, "到"],
                                       [257, 258, "去"],
                                       [285, 286, "看"],
                                       [314, 316, "夕陽"]],
                                      [],
                                      [[132, 133, "溜"], [161, 163, "小狗"]],
                                      []],
                "event_list": [[[146, 205, "遛小孩"]],
                               [],
                               [[244, 272, "去"], [272, 331, "看夕陽"]],
                               [],
                               [[119, 178, "溜小狗"]],
                               []],
                "location_event_list": [{"event": ["遛小孩"],
                                         "site": [["435藝文特區"]],
                                         "time": [[]]},
                                        {"event": ["去", "看夕陽"],
                                         "site": [["觀音亭"]],
                                         "time": [["傍晚"]]},
                                        {"event": ["溜小狗"],
                                         "site": [["新竹市", "北區"]],
                                         "time": [["今日", "傍晚"]]}],
                "location_stem_list": [[],
                                       [],
                                       [],
                                       [],
                                       [[82, 85, "新竹市"], [106, 108, "北區"]],
                                       []],
                "noun_stem_list": [[[188, 190, "小孩"],
                                    [266, 268, "地方"]],
                                   [],
                                   [[314, 316, "夕陽"]],
                                   [],
                                   [[161, 163, "小狗"]],
                                   []],
                "opendata_place_list": [[[69, 76, "435藝文特區"]],
                                        [],
                                        [[223, 226, "觀音亭"]],
                                        [],
                                        [],
                                        []],
                "person_and_pronoun_list": [[],
                                            [],
                                            [[16, 17, "你"]],
                                            [], 
                                            [], 
                                            []],
                "person_list": [[], [], [], [], [], []],
                "question_list": [[],
                                  [],
                                  [["<CLAUSE_YesNoQ>", "你是否知道傍晚可以到觀音亭去看夕陽喔"]],
                                  [],
                                  [],
                                  []],
                "time_list": [[],
                              [],
                              [[148, 150, "傍晚"]],
                              [],
                              [[10, 12, "今日"],
                               [33, 35, "傍晚"]],
                              []],
                "verb_stem_list": [[[36, 38, "得知"], 
                                    [159, 160, "遛"]],
                                   [],
                                   [[122, 124, "知道"],
                                    [191, 192, "到"],
                                    [257, 258, "去"],
                                    [285, 286, "看"]],
                                   [],
                                   [[132, 133, "溜"]],
                                   []]}
}
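
The offsets in the example response are consistent with counting characters in the tagged result_pos string: in the fifth element, "新竹市" starts at index 82, right after its opening <LOCATION> tag. The sketch below reproduces that counting on the client. The function name `spans` is hypothetical, and the behaviour for index_with_pos set to false (offsets counted in the sentence with all tags stripped) is an assumption, since the documentation shows no example for that case.

```python
import re

TAG_PATTERN = re.compile(r"<([A-Za-z_]+)>([^<]*)</\1>")


def spans(sentence, wanted_tags, index_with_pos=True):
    """Return [[start, end, text], ...] for tokens whose tag is in wanted_tags."""
    result = []
    plain_offset = 0  # position in the sentence once every tag is stripped
    for m in TAG_PATTERN.finditer(sentence):
        text = m.group(2)
        if m.group(1) in wanted_tags:
            # m.start(2) is the index of the text inside the tagged string.
            start = m.start(2) if index_with_pos else plain_offset
            result.append([start, start + len(text), text])
        plain_offset += len(text)
    return result


sentence = ("<TIME_day>今日</TIME_day><TIME_day>傍晚</TIME_day>"
            "<FUNC_inner>在</FUNC_inner><LOCATION>新竹市</LOCATION>"
            "<LOCATION>北區</LOCATION><ACTION_verb>溜</ACTION_verb>"
            "<ENTITY_nouny>小狗</ENTITY_nouny>")
print(spans(sentence, {"LOCATION"}))
print(spans(sentence, {"LOCATION"}, index_with_pos=False))
```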

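Articut Toolkit

Keyword extraction based on the TF-IDF algorithm

Request parameters

  • result_segmentation: the Articut segmentation result; the text from which keywords are extracted.
  • top_k: integer, default 50. The number of TF-IDF keywords to extract.
  • with_weight: Boolean, default False. Whether the keyword weights are returned.
  • allow_pos: list, default empty, meaning every part of speech is extracted. Restricts extraction to the listed parts of speech.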

Articut API URL

POST https://api.droidtown.co/Articut/Toolkit/TFIDF/

Request body (JSON format)

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "result_segmentation": "沒有/人/可以/決定/你/的/命運/,/命運/在/自己/的/手/上/。/",
    "with_weight": true
}

Response body (JSON format)

{
    "msg": "Success!",
    "status": true,
    "tfidf": [["命運", 0.27356173082825286],
              ["你", 0.13678086541412643],
              ["手", 0.11362471190151249],
              ["決定", 0.10007923043569088],
              ["自己", 0.06731240487628462],
              ["人", 0.05667373578657066]]
}
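
For orientation, the term-frequency half of TF-IDF can be computed locally from a result_segmentation string. The snippet below is only a sketch: `term_frequencies` is a hypothetical helper, dropping punctuation-only tokens is an assumption, and the result will not match the weights returned by the service, because the inverse-document-frequency corpus used by the server is not described here.

```python
from collections import Counter


def term_frequencies(result_segmentation):
    """Return {term: relative frequency} for a slash-delimited segmentation."""
    tokens = [
        t for t in result_segmentation.split("/")
        # Drop empty pieces and tokens made only of punctuation.
        if t and any(ch.isalnum() for ch in t)
    ]
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


tf = term_frequencies("沒有/人/可以/決定/你/的/命運/,/命運/在/自己/的/手/上/。/")
print(sorted(tf.items(), key=lambda item: item[1], reverse=True)[:3])
```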

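Keyword extraction based on the TextRank algorithm

First, the text from which keywords are to be extracted is segmented. Next, an unweighted graph is built from the co-occurrence of words within a fixed window (default size 5, adjustable through the span attribute). Finally, the PageRank of each node in the graph is computed. Algorithm paper: TextRank: Bringing Order into Texts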
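
The three steps can be sketched in a few lines of Python. This is an illustration of the general technique, not the service's implementation: the function name `textrank`, the damping factor 0.85, and the fixed 50 iterations are assumptions, and the scores are normalized to sum to 1, so they will not match the scale of the weights in the API response.

```python
def textrank(tokens, span=5, damping=0.85, iterations=50):
    """Rank tokens by PageRank over an unweighted co-occurrence graph."""
    # Link every pair of distinct tokens that fall inside one window of
    # `span` consecutive positions.
    neighbours = {t: set() for t in tokens}
    for i, left in enumerate(tokens):
        for right in tokens[i + 1:i + span]:
            if left != right:
                neighbours[left].add(right)
                neighbours[right].add(left)
    n = len(neighbours)
    scores = {t: 1.0 / n for t in neighbours}
    for _ in range(iterations):
        new_scores = {}
        for t in neighbours:
            # Each neighbour shares its score equally among its own links.
            incoming = sum(scores[u] / len(neighbours[u]) for u in neighbours[t])
            new_scores[t] = (1 - damping) / n + damping * incoming
        scores = new_scores
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


words = ["沒有", "人", "可以", "決定", "你", "的", "命運", "命運", "在", "自己", "的", "手", "上"]
for word, score in textrank(words)[:3]:
    print(word, round(score, 4))
```

A production version would usually filter tokens by part of speech first, which is the role allow_pos plays in the API.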

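Request parameters

  • result_pos: the tagged segmentation result returned by Articut; the text from which keywords are extracted.
  • top_k: integer, default 10. The number of keywords to extract.
  • with_weight: Boolean, default False. Whether the keyword weights are returned.
  • allow_pos: list, default empty, meaning every part of speech is extracted. Restricts extraction to the listed parts of speech.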

Articut API URL

POST https://api.droidtown.co/Articut/Toolkit/TextRank/

Request body (JSON format)

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "result_pos": ["<FUNC_negation>沒有</FUNC_negation><ENTITY_nouny>人</ENTITY_nouny><MODAL>可以</MODAL><ACTION_verb>決定</ACTION_verb><ENTITY_pronoun>你</ENTITY_pronoun><FUNC_inner>的</FUNC_inner><ENTITY_nouny>命運</ENTITY_nouny>",
                   ",",
                   "<ENTITY_oov>命運</ENTITY_oov><FUNC_inner>在</FUNC_inner><ENTITY_pronoun>自己</ENTITY_pronoun><FUNC_inner>的</FUNC_inner><ENTITY_nouny>手</ENTITY_nouny><RANGE_locality>上</RANGE_locality>",
                   "。"],
    "with_weight": true
}

Response body (JSON format)

{
    "msg": "Success!",
    "status": true,
    "textrank": [["命運", 5.591625666787958],
                 ["自己", 3.4637188376903927],
                 ["你", 3.4637188376903927],
                 ["決定", 3.4637188376903927],
                 ["手", 2.959546855830475],
                 ["人", 2.9595468558304745]]
}

Allowed HTTPS request method:

  • POST: We only support the POST method at this time.

Description Of Your Payload Content:

  • username*: The email address of the account you use to log in to this website.
  • api_key*: A hashed string 31 characters long.
  • input_str*: The string to be processed. For each request, the length of the string must not exceed 2000 characters.
  • version: Version is optional. If a blank value "" or an explicit "latest" string is given, the latest version of the Articut algorithm is applied to process your input_str. Otherwise, when a specific version string such as "v001" is given, Articut algorithm version v001 is applied to process your input_str.
  • level: Level is optional. The value is either "lv1" or "lv2".
  • user_defined_dict_file: File is optional. Format must be dictionary. (e.g. UserDefinedDICT = {"key": ["value1", "value2",...],...})
  • opendata_place: Boolean type. Default value is False. The data is sourced from the government OpenData platform.

Description Of The Returned Dictionary:

  • status: Boolean type. The value will be either True or False.
  • msg: String type. The value and the meaning are listed:
    • Success!: Process successfully performed.
    • Specified version does not exist.: There's no corresponding Articut version to the one you specified. Please check the version you specified or just leave it blank for the latest version of Articut algorithm.
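    • Specified level does not exist.: There's no corresponding knowledge level to the one you specified. The level must be "lv1" or "lv2".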
    • Authtication failed.: We cannot identify your account. Please check your username again.
    • API_Key failed.: The api_key value we received is invalid. Please check the api_key again.
    • Your input_str is too long. (over 2000 characters.): The input_str exceeds the 2000-character limit.
    • Insufficient word count balance.: The word count balance under your account is not enough to process the string specified. Please purchase more quota.
    • Internal server error. (Your word count balance is not consumed, don't worry. System will reboot in 5min, please try again later.): Oops, there seems to be something wrong with our service. We are working hard to have it back online as soon as possible, please try again later. Don't worry, we didn't consume the word count balance under your account since this request is unsuccessful.
    • Invalid content_type.: The content_type must be json. (application/json)
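    • Invalid arguments.: The parameters uploaded are wrong. Please check that the parameter names follow the specification.
    • UserDefinedDICT Parsing ERROR. (Please check your the format and encoding.): The user-defined dictionary cannot be loaded. Please check that the format (dict) and encoding (UTF-8) are correct.
    • Maximum UserDefinedDICT file size exceeded! (UserDefinedDICT file shall be samller than 10MB.): The user-defined dictionary file is larger than 10MB.
  • result_pos: List type. The segmentation result with POS tags of the string processed. Each sentence is separated out as a string and packed into a list.
  • result_segmentation: String type. The segmentation result without POS tags of the string processed. The whole input is concatenated into one string.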
  • exec_time: Float type. The execution time of the request at our server with second as the unit.
  • version: String type. The algorithm version that Articut uses to process the string sent.
  • level: String type. The knowledge level that Articut uses to process the string sent.
  • word_count_balance: Integer Type. The remaining word count quota under your account.
  • <IDIOM>: idiom and phrases.
  • <FUNC_determiner>: determiners.
  • <TIME_holiday>: time expressions about holidays.
  • <TIME_justtime>: time expressions related to now or short present time.
  • <TIME_day>: time expressions about day(s).
  • <TIME_week>: time expressions about week(s).
  • <TIME_month>: time expressions about month(s).
  • <TIME_season>: time expressions about season(s).
  • <TIME_year>: time expressions about year(s).
  • <TIME_decade>: time expressions that indicate a time longer than a decade.

    Basically, you may simply take all the tags listed above as time expressions/temporal adverbs in the sentence.

  • <ENTITY_num>: numbers.

  • <ENTITY_classifier>: classifiers.
  • <ENTITY_measurement>: measurements of some dimension, e.g., weight, length, etc.
  • <ENTITY_person>: nouns. Articut presumes it should refer to a person.
  • <ENTITY_pronoun>: pronouns.
  • <ENTITY_possessive>: possessive expressions.
  • <ENTITY_noun>: nouns.
  • <ENTITY_nounHead>: the head of a noun phrase.
  • <ENTITY_nouny>: presumed noun.
  • <ENTITY_oov>: Out-of-vocabulary. Articut cannot identify the POS of the phrase, but suggests that it should be a noun.

    You may take all the tags listed above as nouns. A few nominal composition rules apply:

    • Consecutive "nouny" tokens can be taken as one noun phrase (e.g., 咖啡(ENTITY_nouny) 杯(ENTITY_nouny) can be taken as "咖啡杯" (ENTITY_NP));
    • A nounHead forms a noun phrase with whatever precedes it (e.g., 小(MODIFIER) 紅(MODIFIER_color) 帽(ENTITY_nounHead) can be seen as "小紅帽" (ENTITY_NP); 遊樂(ACTION_verb) 場(ENTITY_nounHead) can be seen as "遊樂場" (ENTITY_NP));
    • Unlike a nounHead, a nouny does not form a noun phrase with a verb (ACTION_verb) that precedes it. A nouny can only serve as the object of that verb, turning the pair into a verb phrase (e.g., 認識(ACTION_verb) 字(ENTITY_nouny) becomes "認識字" (ACTION_VP)).
  • <MODIFIER>: adjectives and adverbs.

  • <MODIFIER_color>: colors.
  • <FUNC_degreeHead>: the head of a modifier phrase that combines with the modifier alongside it to form a Degree Phrase.
  • <ASPECT>: aspectual markers.
  • <QUANTIFIER>: quantifier markers.
  • <LOCATION>: locations.
  • <RANGE_locality>: a range related to some location.
  • <RANGE_period>: a range related to some time.
  • <FUNC_modifierHead>: the head of an adjective or adverb phrase.
  • <MODAL>: modal.
  • <AUX>: auxiliary.

  • <CLAUSE_AnotAQ>: A-not-A interrogative sentence.

  • <CLAUSE_YesNoQ>: Yes/No interrogative sentence.
  • <CLAUSE_WhoQ>: "WHO" interrogative sentence.
  • <CLAUSE_WhatQ>: "WHAT" interrogative sentence.
  • <CLAUSE_WhereQ>: "WHERE" interrogative sentence.
  • <CLAUSE_WhenQ>: "WHEN" interrogative sentence.
  • <CLAUSE_HowQ>: "HOW" interrogative sentence.
  • <CLAUSE_WhyQ>: "WHY" interrogative sentence.
  • <CLAUSE_Particle>: sentential particles with no significant meanings.

  • <FUNC_negation>: negation function words.

  • <FUNC_conjunction>: conjunction function words.
  • <FUNC_inter>: Outward function words that indicate the full meaning of the sentence is beyond this sentence.
  • <FUNC_inner>: Inward function words that indicate the full meaning of the sentence can be satisfied within this sentence.
  • <ACTION_lightVerb>: light verbs.
  • <ACTION_verb>: verbs.
  • <ACTION_quantifiedVerb>: quantified verbs.
  • <KNOWLEDGE_addTW>: Taiwan address.
  • <KNOWLEDGE_url>: URL link.
  • <KNOWLEDGE_place>: Scenic spots listed in the Governmental Open Data Platform.
  • <UserDefined>: Phrases defined by users.

Articut API URL

POST https://api.droidtown.co/Articut/API/

Sending Request

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "input_str": "我想過過過兒過過的日子。"
}

Returned JSON

{
    "status": true,
    "msg": "Success!",
    "result_pos": ["<ENTITY_pronoun>我</ENTITY_pronoun><ACTION_verb>想</ACTION_verb><ACTION_verb>過</ACTION_verb><ACTION_verb>過</ACTION_verb><ENTITY_pronoun>過兒</ENTITY_pronoun><ACTION_verb>過</ACTION_verb><ASPECT>過</ASPECT><FUNC_inner>的</FUNC_inner><ENTITY_noun>日子</ENTITY_noun>","。"],
    "result_segmentation": "我/想/過/過/過兒/過/過/的/日子/。/",
    "exec_time": 0.08697724342346191,
    "version": "latest",
    "level": "lv1",
    "word_count_balance": 99988
}
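
Since every response carries status and msg, a client can turn failed calls into exceptions in one place. The sketch below is one possible shape; `ArticutError`, `unwrap`, and `is_retryable` are hypothetical names, and treating only the "Internal server error." message as retryable is an assumption based on its advice to try again later.

```python
class ArticutError(Exception):
    """Raised when a response reports status false."""


# Messages starting with these prefixes ask the caller to try again later.
RETRYABLE_PREFIXES = ("Internal server error.",)


def unwrap(response):
    """Return the response if it succeeded, otherwise raise ArticutError(msg)."""
    if response.get("status"):
        return response
    raise ArticutError(response.get("msg", "unknown error"))


def is_retryable(error):
    """Tell whether the failure message suggests retrying the request."""
    return str(error).startswith(RETRYABLE_PREFIXES)


try:
    unwrap({"status": False, "msg": "Insufficient word count balance."})
except ArticutError as error:
    print(error, "| retry:", is_retryable(error))
```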

Versions API URL

POST https://api.droidtown.co/Articut/Versions/

Sending Request

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all"
}

Returned JSON

{
    "status": true,
    "msg": "Success!",
    "versions": [
        {
            "version": "latest",
            "release_date": "2019-02-22",
            "level": ["lv1"]
        },
        {
            "version": "v100",
            "release_date": "2019-02-22",
            "level": ["lv1"]
        },
        {
            "version": "v002",
            "release_date": "2019-02-20",
            "level": ["lv1"]
        }
    ]
}

Since Articut deals only with "linguistic competence" and not with "encyclopedic knowledge", a "UserDefined" dictionary function is provided to fill the gap.
Words listed in the "UserDefined" dictionary, which uses a key-value dictionary format, will be tagged as <UserDefined>.
Users can edit the dictionary themselves according to their own domain knowledge.


Articut API URL

POST https://api.droidtown.co/Articut/API/

Sending Request

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "input_str": "我正在計劃地球人類補完計劃",
    "version": "v132",
    "level": "lv1",
    "user_defined_dict_file": {"地球人類補完計劃": ["人類補完計劃", "補完計劃"]}
}

Returned JSON

{
    "exec_time": 0.013453006744384766,
    "level": "lv1",
    "msg": "Success!",
    "result_pos": ["<ENTITY_pronoun>我</ENTITY_pronoun><ASPECT>正在</ASPECT><ACTION_verb>計劃</ACTION_verb><UserDefined>地球人類補完計劃</UserDefined>"],
    "result_segmentation": "我/正在/計劃/地球人類補完計劃/",
    "status": true,
    "version": "v132",
    "word_count_balance": 99987
}

Scenic spots listed in the governmental open data platform will be marked as <KNOWLEDGE_place>.


Articut API URL

POST https://api.droidtown.co/Articut/API/

Sending Request

{
    "username": "test@email.com",
    "api_key": "anapikeyfordocthatdoesnwork@all",
    "input_str": "花蓮的原野牧場有一間餐廳",
    "version": "v137",
    "level": "lv1",
    "opendata_place": true
}

Returned JSON

{
    "exec_time": 0.013453006744384766,
    "level": "lv1",
    "msg": "Success!",
    "result_pos": ["<LOCATION>花蓮</LOCATION><FUNC_inner>的</FUNC_inner><KNOWLEDGE_place>原野牧場</KNOWLEDGE_place><ACTION_verb>有</ACTION_verb><ENTITY_classifier>一間</ENTITY_classifier><ENTITY_noun>餐廳</ENTITY_noun>"],
    "result_segmentation": "花蓮/的/原野牧場/有/一間/餐廳/",
    "status": true,
    "version": "v137",
    "word_count_balance": 99988
}