Tools for Text Analytics

Tools that support “Natural Language Processing” for text analytics.

bitTA-package

Third-party Bit2R package providing text analytics (TA) functions

Text data preprocessing functions

filter_text()

Filter data by string matches in text data

replace_text() concat_text() split_text() remove_text()

Replace, join, split, or remove strings in text data

get_meta() set_meta()

Get and set meta information for text data preprocessing

get_spacing()

Automatic spacing for Korean text

get_ngrams()

N-gram tokenization

tokenize_noun_ngrams()

N-gram Tokenizer

unnest_noun_ngrams()

Wrapper around unnest_tokens for noun n-grams

collapse_noun()

Extract Collapsed Noun
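
As a rough illustration of what the n-gram entries above refer to, the following base R sketch builds word n-grams by sliding a fixed-width window over whitespace-separated tokens. The helper `word_ngrams()` is hypothetical and is not part of bitTA; it makes no claim about how get_ngrams(), tokenize_noun_ngrams(), or unnest_noun_ngrams() are implemented or what arguments they take, and it does not perform the noun extraction those functions describe.

```r
# Hypothetical helper for illustration only; not a bitTA function.
# Builds word n-grams from a single character string.
word_ngrams <- function(text, n = 2L) {
  # Split on runs of whitespace and drop any empty tokens.
  tokens <- strsplit(trimws(text), "\\s+")[[1]]
  tokens <- tokens[nzchar(tokens)]

  # Fewer tokens than the window width means no n-grams.
  if (length(tokens) < n) {
    return(character(0))
  }

  # Slide a window of width n across the token vector.
  starts <- seq_len(length(tokens) - n + 1L)
  vapply(
    starts,
    function(i) paste(tokens[i:(i + n - 1L)], collapse = " "),
    character(1)
  )
}

bigrams <- word_ngrams("text analytics with n-gram tokens", n = 2L)
print(bigrams)
```

For Korean text, a whitespace split is only a crude stand-in: a morphological analyzer would normally supply the tokens before the window is applied.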

Text data exploration functions

explore_docs()

Text Data Explorer

Morphological analysis functions

install_mecab_ko()

Install the Eunjeonhan morphological analyzer and its dictionary

regist_mecab_ko()

Register the path where Mecab-Ko is installed

morpho_mecab()

Part-of-speech tagger based on the mecab-ko morphological analyzer

Sentiment analysis functions

get_opinion()

KOSAC (Korean Sentiment Analysis Corpus) sentiment analysis

Text data

buzz

Scraped Naver Cafe post data

rest_area

Buzz data related to highway rest areas

president_speech

President's Speech

movie_ratings_train movie_ratings_test

Naver sentiment movie corpus v1.0

polarity

KOSAC (Korean Sentiment Analysis Corpus) sentiment dictionary

sentiment_dic

KNU Korean Sentiment Dictionary