Reading and Processing Gigabyte-Scale Text Data in Java
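When processing text, you will often run into data files larger than 1 GB. Reading such a file naively, by loading all of it into memory at once, can exhaust the Java heap. To get around this, split the large file by lines into many chunks and store each chunk as its own text file.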
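import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class BigDataRead {
    public static void main(String[] args) throws IOException {
        long timer = System.currentTimeMillis();
        int bufferSize = 20 * 1024 * 1024; // 20 MB read buffer
        // Build a buffered text input stream
        File file = new File("I:/qianyang/预测数据/user_contentdetection_qianyang");
        FileInputStream fileInputStream = new FileInputStream(file);
        BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream);
        // Text can come out garbled here; set the charset to match how the file was saved.
        // The charset name was lost from the original listing; "UTF-8" is an assumed placeholder.
        InputStreamReader inputStreamReader = new InputStreamReader(bufferedInputStream, "UTF-8");
        BufferedReader input = new BufferedReader(inputStreamReader, bufferSize);
        // The next two values were lost from the original listing; both are placeholders.
        int splitNum = 100;          // number of full-size chunks
        long fileLines = 10000000L;  // total lines in the input file, counted beforehand
        long perSplitLines = fileLines / splitNum; // lines per chunk
        // i runs from 0 to splitNum inclusive (start index assumed), so the
        // extra pass collects the remainder left by the integer division.
        for (int i = 0; i <= splitNum; ++i) {
            // One output file per chunk; the file name pattern is a placeholder.
            BufferedWriter output = new BufferedWriter(new OutputStreamWriter(
                    new FileOutputStream(new File("part" + i + ".txt")), "UTF-8"));
            String line;
            // Read and write line by line until the chunk is full or the input ends
            for (long lineCounter = 0;
                 lineCounter < perSplitLines && (line = input.readLine()) != null;
                 ++lineCounter) {
                output.append(line + "\n"); // line terminator assumed
            }
            output.flush();
            output.close();
        }
        input.close();
        timer = System.currentTimeMillis() - timer;
        System.out.println("Elapsed time (ms): " + timer);
    }
}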
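
The listing above depends on `fileLines`, the total number of lines, being known in advance. As a possible variation, the sketch below fixes the number of lines per chunk instead, so no line count is needed. The class `LineChunkSplitter`, the method `split`, and the `part<N>.txt` naming are hypothetical and not part of the original code; UTF-8 in `main` is an assumption and should match the file's actual encoding.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: splits a text file into chunks of a fixed number of lines.
class LineChunkSplitter {

    /**
     * Copies input into part0.txt, part1.txt, ... inside outDir, each holding
     * at most linesPerChunk lines. Returns the number of chunk files written.
     * Only one line is held in memory at a time (plus the stream buffers).
     */
    static int split(Path input, Path outDir, int linesPerChunk, Charset charset)
            throws IOException {
        if (linesPerChunk <= 0) {
            throw new IllegalArgumentException("linesPerChunk must be positive");
        }
        Files.createDirectories(outDir);
        int chunkCount = 0;
        try (BufferedReader reader = Files.newBufferedReader(input, charset)) {
            String line = reader.readLine();
            // Open a new chunk file only while there is still a line to write,
            // so no empty trailing file is created.
            while (line != null) {
                Path part = outDir.resolve("part" + chunkCount + ".txt");
                try (BufferedWriter writer = Files.newBufferedWriter(part, charset)) {
                    for (int n = 0; n < linesPerChunk && line != null; n++) {
                        writer.write(line);
                        writer.newLine();
                        line = reader.readLine();
                    }
                }
                chunkCount++;
            }
        }
        return chunkCount;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: LineChunkSplitter <input> <outDir> <linesPerChunk>");
            return;
        }
        // UTF-8 is assumed here; change it to match the input file.
        int chunks = split(Paths.get(args[0]), Paths.get(args[1]),
                Integer.parseInt(args[2]), StandardCharsets.UTF_8);
        System.out.println("Wrote " + chunks + " chunk files");
    }
}
```

The try-with-resources blocks close each reader and writer even if an exception is thrown midway. Note that `Files.newBufferedReader` decodes strictly, so a wrong charset tends to surface as a `MalformedInputException` rather than as silently garbled text.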
