Java character-storage classes: String, StringBuffer, StringBuilder, CharBuffer, CharSequence, ByteBuffer

Published 20 Dec 2013 06:22
Tags: blog
In Java, the classes related to character storage are String, StringBuffer, StringBuilder, CharBuffer, and CharSequence, plus ByteBuffer, which can hold raw data.

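  • String - immutable string class; the most convenient, but inefficient when modified repeatedly, since every change creates a new object
  • StringBuffer - thread-safe (synchronized) mutable string buffer; fewer string operations than String, but faster for repeated modification
  • StringBuilder - mutable string buffer with the same API as StringBuffer, but not thread-safe; it skips synchronization and is therefore faster than StringBuffer
  • CharBuffer - character buffer (java.nio); no string manipulation methods
  • CharSequence - character-sequence interface implemented by String, StringBuffer, StringBuilder, and CharBuffer
  • ByteBuffer - byte buffer (java.nio); holds raw bytes with no associated character set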
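
As a rough illustration of the list above, the following sketch touches each class once. The class name CharStorageDemo and the helper countUpper are made up for this example; the point is only that one method taking a CharSequence accepts String, StringBuffer, StringBuilder, and CharBuffer alike.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

public class CharStorageDemo {
    // Hypothetical helper: works on any CharSequence implementation.
    static int countUpper(CharSequence seq) {
        int n = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (Character.isUpperCase(seq.charAt(i))) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "Abc";
        String t = s.concat("D");            // String is immutable: concat returns a new object
        StringBuilder sb = new StringBuilder("Abc");
        sb.append('D');                      // modified in place, no synchronization
        StringBuffer sbuf = new StringBuffer("Abc");
        sbuf.append('D');                    // same API, but the methods are synchronized
        CharBuffer cb = CharBuffer.wrap("AbcD");
        // Encoding a duplicate leaves cb's own position untouched.
        ByteBuffer bb = StandardCharsets.UTF_8.encode(cb.duplicate());

        System.out.println(s + " " + t);     // Abc AbcD
        System.out.println(countUpper(t) + " " + countUpper(sb) + " "
                + countUpper(sbuf) + " " + countUpper(cb)); // 2 2 2 2
        System.out.println(bb.remaining());  // 4 bytes for four ASCII chars
    }
}
```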

Java: StringBuffer to byte[] without toString
http://stackoverflow.com/questions/19472011/java-stringbuffer-to-byte-without-tostring

The title says it all. Is there any way to convert from StringBuilder to byte[] without using a String in the middle?

The problem is that I'm managing REALLY large strings (millions of chars), and then I have a cycle that adds a char in the end and obtains the byte[]. The process of converting the StringBuffer to String makes this cycle veryyyy very very slow.

Is there any way to accomplish this? Thanks in advance!

—-

The closest you can get is to get a char[] array. StringBuffer#getChars(int, int, char[], int) – Martijn Courteaux Oct 19 at 22:55
why not use CharBuffer instead? And then do "charBuffer.array()"? – tolitius Oct 19 at 22:57
Can you clarify why you need to store all these big strings in memory? Is this something a user is waiting on? Could this instead become a MapReduce or Spark job? I just wonder if maybe this question is a symptom of an architectural design smell. – Vidya Oct 19 at 23:07

Also, you say StringBuilder in the question but StringBuffer in the title. If you're going to do this, that will make a big difference. Go StringBuilder. – Vidya Oct 19 at 23:09

—-
As many have already suggested, you can use the CharBuffer class, but allocating a new CharBuffer would only make your problem worse.

Instead, you can directly wrap your StringBuilder in a CharBuffer, since StringBuilder implements CharSequence:

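import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
// No allocation performed, just wraps the StringBuilder.
CharBuffer buffer = CharBuffer.wrap(stringBuilder);
// encode() can throw CharacterCodingException; array() may be longer than the
// encoded data, so copy only the valid bytes.
ByteBuffer encoded = encoder.encode(buffer);
byte[] bytes = new byte[encoded.remaining()];
encoded.get(bytes);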
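
A self-contained sketch of the wrapping approach in this answer, for anyone who wants to run it. The class name BuilderToBytes and the helper toUtf8Bytes are hypothetical, and UTF-8 is assumed as the target charset. One detail worth checking in practice: the ByteBuffer returned by encode() can have a backing array larger than the encoded data, so this version copies only the bytes up to the buffer's limit.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BuilderToBytes {
    // Hypothetical helper: encodes the builder's chars as UTF-8 without calling toString().
    static byte[] toUtf8Bytes(StringBuilder sb) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        try {
            // CharBuffer.wrap gives a read-only view over the builder; no char copy is made.
            ByteBuffer encoded = encoder.encode(CharBuffer.wrap(sb));
            // Copy only the valid region [position, limit), not the whole backing array.
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Thrown for input such as unpaired surrogates.
            throw new IllegalArgumentException("input cannot be encoded as UTF-8", e);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("h\u00e9llo");
        byte[] bytes = toUtf8Bytes(sb);
        System.out.println(bytes.length); // 6: the accented char takes two bytes in UTF-8
        System.out.println(Arrays.equals(bytes,
                sb.toString().getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

The toString() call in main is there only to check the result against the conventional route.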

—-

For starters, you should probably be using StringBuilder, since StringBuffer has synchronization overhead that's usually unnecessary.

Unfortunately, there's no way to go directly to bytes, but you can copy the chars into an array or iterate from 0 to length() and read each charAt().
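
Both options mentioned here can be sketched roughly as follows. The class and method names are invented for illustration, and the charAt() variant assumes the content is pure ASCII so that each char maps to exactly one byte; for anything else a real charset encoder is needed.

```java
public class BuilderChars {
    // Option 1: bulk-copy the chars into an array with getChars.
    static char[] copyChars(StringBuilder sb) {
        char[] out = new char[sb.length()];
        sb.getChars(0, sb.length(), out, 0);
        return out;
    }

    // Option 2: iterate with charAt. Assumption: ASCII-only input,
    // so a narrowing cast per char is a valid encoding.
    static byte[] asciiBytes(CharSequence seq) {
        byte[] out = new byte[seq.length()];
        for (int i = 0; i < seq.length(); i++) {
            char c = seq.charAt(i);
            if (c > 0x7F) {
                throw new IllegalArgumentException("non-ASCII char at index " + i);
            }
            out[i] = (byte) c;
        }
        return out;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        System.out.println(copyChars(sb).length);  // 3
        System.out.println(asciiBytes(sb)[0]);     // 97
    }
}
```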

—-

What are you trying to accomplish with "million of chars"? Are these logs that need to be parsed? Can you read it as just bytes and stick to a ByteBuffer? Then you can do:

buffer.array()
to get a byte[]
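
A small sketch of what array() returns on a heap ByteBuffer. The names ByteBufferDemo and written are hypothetical. Note that array() hands back the whole backing array, including unused capacity, so copying just the written region is often what is wanted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ByteBufferDemo {
    // Hypothetical helper: returns only the bytes written so far, i.e. [0, position).
    static byte[] written(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate(); // independent position/limit, shared content
        view.flip();                       // limit = position, position = 0
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.put("log line".getBytes(StandardCharsets.US_ASCII));
        System.out.println(buf.array().length);  // 64, the full capacity
        System.out.println(written(buf).length); // 8, only what was put
    }
}
```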

Depending on what you are doing, you can also use just a char[] or a CharBuffer:

CharBuffer cb = CharBuffer.allocate(4242);
cb.put("Depends on what it is you need to do");

Then you can get a char[] as:

cb.array()
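
In plain Java the same steps might look like the sketch below (class and method names are made up). Keep in mind that array() returns the full 4242-element backing array, so only the range up to the buffer's position holds the characters that were put.

```java
import java.nio.CharBuffer;

public class CharBufferDemo {
    // Hypothetical helper: builds a String from the written region [0, position) only.
    static String contents(CharBuffer cb) {
        return new String(cb.array(), 0, cb.position());
    }

    public static void main(String[] args) {
        CharBuffer cb = CharBuffer.allocate(4242);
        cb.put("There Be");
        cb.put(" Dragons");
        System.out.println(cb.array().length); // 4242, the whole backing array
        System.out.println(contents(cb));      // There Be Dragons
    }
}
```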
It's always good to REPL things out, it's fun and proves the point. Java REPL is not something we are accustomed to, but hey, there is Clojure to save the day which speaks Java fluently:

user=> (import java.nio.CharBuffer)
java.nio.CharBuffer

user=> (def cb (CharBuffer/allocate 4242))
#'user/cb

user=> (-> (.put cb "There Be") (.array))
#<char[] [C@206564e9>

user=> (-> (.put cb " Dragons") (.array) (String.))
"There Be Dragons"

Reference:
http://www.techscore.com/tech/Java/JavaSE/NIO/1-2/


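The text of this page may be modified and reused under the Creative Commons Attribution-ShareAlike 3.0 license and the GNU Free Documentation License, with one special requirement: please credit the source and author of the article with a link. Please help protect the author's legitimate rights.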

