Package index • bitNLP

Tools for Text Analytics

Tools that support “Natural Language Processing” for text analytics.

bitNLP bitNLP-package: Third-party package of Bit2R with text analytics functions

Text data preprocessing functions

filter_text(): Filter data based on string matches of text data

replace_text() concat_text() split_text() remove_text(): Replace/remove/join/separate strings in text data

append_userdic_meta(): Write to the user-defined noun dictionary file.

get_meta() set_meta(): Meta information processing for text data pre-processing

get_userdic_meta(): Query the user-defined person dictionary file.

get_spacing(): Korean automatic spacing

get_ngrams(): Tokenization with N-gram

tokenize_noun_ngrams(): N-gram Tokenizer

unnest_noun_ngrams(): Wrapper around unnest_tokens for noun n-grams

collapse_noun(): Extract Collapsed Noun
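
The n-gram entries above all rest on the same idea: slide a window of n tokens across a token sequence and join each window into one string. The following is a minimal base R sketch of that idea, not the implementation of get_ngrams() or tokenize_noun_ngrams(). The helper name word_ngrams() and its arguments are hypothetical, and the package functions may take different inputs and return different structures.

```r
# Hypothetical helper (not part of bitNLP): build word n-grams from a
# character vector of tokens using base R only.
word_ngrams <- function(tokens, n = 2L, sep = " ") {
  # Fewer tokens than the window size means no n-gram can be formed.
  if (length(tokens) < n) return(character(0))
  # One n-gram starts at every position that leaves room for n tokens.
  starts <- seq_len(length(tokens) - n + 1L)
  vapply(
    starts,
    function(i) paste(tokens[i:(i + n - 1L)], collapse = sep),
    character(1)
  )
}

# Toy input: whitespace-split tokens stand in for extracted nouns.
tokens <- strsplit("natural language processing for text", " ", fixed = TRUE)[[1]]
bigrams <- word_ngrams(tokens, n = 2L)
print(bigrams)
```

For Korean text the tokens would normally come from a morphological analyzer rather than from whitespace splitting, which is why the package pairs its n-gram functions with noun extraction.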

Text data exploration functions

explore_docs(): Text Data Explorer

Morphological analysis functions

install_mecab_ko(): Install the Eunjeonhan morphological analyzer and dictionary

regist_mecab_ko(): Register the path where Mecab-Ko is installed

morpho_mecab(): Part-of-speech tagger based on the mecab-ko morphological analyzer

Sentiment analysis functions

get_opinion(): KOSAC (Korean Sentiment Analysis Corpus) Sentiment Analysis

get_polarity(): KNU Korean Sentiment Dictionary Sentiment Analysis

Co-occurrence analysis

collocate(): Calculate table for co-occurrence analysis

coll_scores(): Calculate t-score and mutual information score
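
For readers unfamiliar with the two measures, the sketch below uses the definitions common in corpus linguistics: with observed co-occurrence count O and expected count E = f1 * f2 / N, the t-score is (O - E) / sqrt(O) and mutual information is log2(O / E). Whether coll_scores() uses exactly these forms (for example, whether it scales E by the collocation window size) is an assumption here. The function name coll_measures() and its arguments are hypothetical, and the input counts are made-up illustrative values.

```r
# Hypothetical helper (not part of bitNLP): textbook t-score and mutual
# information for a node/collocate pair.
#   o  : observed co-occurrence count of the pair
#   f1 : corpus frequency of the node word
#   f2 : corpus frequency of the collocate
#   n  : total number of tokens in the corpus
coll_measures <- function(o, f1, f2, n) {
  e <- f1 * f2 / n                    # expected count under independence
  data.frame(
    expected = e,
    t_score  = (o - e) / sqrt(o),     # confidence that the pair is non-random
    mi       = log2(o / e)            # strength of association
  )
}

# Made-up counts for illustration only.
res <- coll_measures(o = 30, f1 = 1000, f2 = 500, n = 1e6)
print(res)
```

The t-score tends to favor frequent pairs, while mutual information tends to favor rare but exclusive pairs, so the two are usually read together.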

Other text processing functions

has_final_consonant(): Test whether a Korean term ends in a final consonant
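
A check like this can be done with Unicode arithmetic alone. Precomposed Hangul syllables occupy U+AC00 to U+D7A3, and each syllable's offset from U+AC00 encodes its final consonant as the remainder modulo 28, where 0 means no final consonant. The base R sketch below illustrates that rule. It is not the implementation of has_final_consonant(), and the name has_jongseong() is hypothetical.

```r
# Hypothetical helper (not part of bitNLP): TRUE if the last character of
# `term` is a Hangul syllable with a final consonant, FALSE if it has none,
# NA if the last character is not a precomposed Hangul syllable.
has_jongseong <- function(term) {
  codes <- utf8ToInt(enc2utf8(term))
  if (length(codes) == 0L) return(NA)
  code <- codes[length(codes)]
  # Precomposed Hangul syllables: U+AC00 .. U+D7A3.
  if (code < 0xAC00 || code > 0xD7A3) return(NA)
  # 28 final-consonant slots per vowel; slot 0 is "no final consonant".
  (code - 0xAC00) %% 28 != 0
}

# Unicode escapes are used so the example does not depend on file encoding.
with_final    <- has_jongseong("\uc0ac\ub78c")  # U+C0AC U+B78C, ends in a consonant
without_final <- has_jongseong("\ub098\ubb34")  # U+B098 U+BB34, ends in a vowel
not_hangul    <- has_jongseong("abc")
```

A typical use of this test is choosing the correct form of a Korean particle that changes depending on whether the preceding word ends in a consonant.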

Text data

buzz: Naver Cafe Post Scraping Data

rest_area: Highway rest area related buzz

president_speech: President's Speech

movie_ratings_train movie_ratings_test: Naver sentiment movie corpus v1.0

polarity: KOSAC (Korean Sentiment Analysis Corpus) sentiment dictionary

sentiment_dic: KNU Korean Sentiment Dictionary

Morphological analyzer dictionary management

add_sysdic(): Add user-defined dictionary files to the system dictionary.

edit_termcost(): Modify the word cost of a word in a dictionary definition file.

get_plan_cost(): Search for tokenizer plans based on word cost

append_userdic_meta(): Write to the user-defined noun dictionary file.

create_userdic(): Create a user dictionary from user-defined dictionary files.

get_userdic_meta(): Query the user-defined person dictionary file.

update_userdic(): Update the user dictionary with user-defined dictionary files.