SUBTLEX-CH: A database of Chinese word and character frequencies
SUBTLEX-CH (Cai & Brysbaert 2010) is a database of Chinese word and character frequencies based on a corpus of film and television subtitles (46.8 million characters, 33.5 million words).
Data Tables: https://doi.org/10.1371/journal.pone.0010729.s002
Note: the zipped file includes three files (SUBTLEX-CH-WF, SUBTLEX-CH-CHR, and SUBTLEX-CH-WF_PoS) providing word and character frequency measures based on the subtitle corpus described above.
Cai, Qing, and Marc Brysbaert. 2010. “SUBTLEX-CH: Chinese Word and Character Frequencies Based on Film Subtitles.” PLOS ONE 5 (6): e10729. https://doi.org/10.1371/journal.pone.0010729.