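Strings
In Java, a single character can be stored in the primitive type char, whereas a string is an object: Java provides the String class for creating and manipulating strings.
Three classes are commonly used to work with strings: String, StringBuilder, and StringBuffer.
Let's look at how each of the three is typically used.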
String
String value = new String("test");
System.out.println(value);
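Characteristics:
Immutable
String value_a = "original value";
value_a = "changed";
System.out.println(value_a);
Actual output:
changed
It looks as though the value changed, so why is String still said to be immutable?
Let's look at what value_a points to while this program runs.
Originally value_a pointed to the string "original value"; afterwards it points to "changed".
To see why String is immutable, let's go into the String source code and see how it is implemented (the excerpt appears to be from JDK 8, where the characters are stored in a char array).
public final class String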
implements java.io.Serializable, Comparable<String>, CharSequence {
/** The value is used for character storage. */
private final char value[];
/** Cache the hash code for the string */
private int hash; // Default to 0
/** use serialVersionUID from JDK 1.0.2 for interoperability */
private static final long serialVersionUID = -6849794470754667710L;
/**
* Class String is special cased within the Serialization Stream Protocol.
*
* A String instance is written into an ObjectOutputStream according to
* <a href="{@docRoot}/../platform/serialization/spec/output.html">
* Object Serialization Specification, Section 6.2, "Stream Elements"</a>
*/
private static final ObjectStreamField[] serialPersistentFields =
new ObjectStreamField[0];
/**
* Initializes a newly created {@code String} object so that it represents
* an empty character sequence. Note that use of this constructor is
* unnecessary since Strings are immutable.
*/
public String() {
this.value = "".value;
}
/**
* Initializes a newly created {@code String} object so that it represents
* the same sequence of characters as the argument; in other words, the
* newly created string is a copy of the argument string. Unless an
* explicit copy of {@code original} is needed, use of this constructor is
* unnecessary since Strings are immutable.
*
* @param original
* A {@code String}
*/
public String(String original) {
this.value = original.value;
this.hash = original.hash;
}
/**
* Allocates a new {@code String} so that it represents the sequence of
* characters currently contained in the character array argument. The
* contents of the character array are copied; subsequent modification of
* the character array does not affect the newly created string.
*
* @param value
* The initial value of the string
*/
public String(char value[]) {
this.value = Arrays.copyOf(value, value.length);
}
/**
* Allocates a new {@code String} that contains characters from a subarray
* of the character array argument. The {@code offset} argument is the
* index of the first character of the subarray and the {@code count}
* argument specifies the length of the subarray. The contents of the
* subarray are copied; subsequent modification of the character array does
* not affect the newly created string.
*
* @param value
* Array that is the source of characters
*
* @param offset
* The initial offset
*
* @param count
* The length
*
* @throws IndexOutOfBoundsException
* If the {@code offset} and {@code count} arguments index
* characters outside the bounds of the {@code value} array
*/
public String(char value[], int offset, int count) {
if (offset < 0) {
throw new StringIndexOutOfBoundsException(offset);
}
if (count <= 0) {
if (count < 0) {
throw new StringIndexOutOfBoundsException(count);
}
if (offset <= value.length) {
this.value = "".value;
return;
}
}
// Note: offset or count might be near -1>>>1.
if (offset > value.length - count) {
throw new StringIndexOutOfBoundsException(offset + count);
}
this.value = Arrays.copyOfRange(value, offset, offset+count);
}
/**
* Allocates a new {@code String} that contains characters from a subarray
* of the <a href="Character.html#unicode">Unicode code point</a> array
* argument. The {@code offset} argument is the index of the first code
* point of the subarray and the {@code count} argument specifies the
* length of the subarray. The contents of the subarray are converted to
* {@code char}s; subsequent modification of the {@code int} array does not
* affect the newly created string.
*
* @param codePoints
* Array that is the source of Unicode code points
*
* @param offset
* The initial offset
*
* @param count
* The length
*
* @throws IllegalArgumentException
* If any invalid Unicode code point is found in {@code
* codePoints}
*
* @throws IndexOutOfBoundsException
* If the {@code offset} and {@code count} arguments index
* characters outside the bounds of the {@code codePoints} array
*
* @since 1.5
*/
public String(int[] codePoints, int offset, int count) {
if (offset < 0) {
throw new StringIndexOutOfBoundsException(offset);
}
if (count <= 0) {
if (count < 0) {
throw new StringIndexOutOfBoundsException(count);
}
if (offset <= codePoints.length) {
this.value = "".value;
return;
}
}
// Note: offset or count might be near -1>>>1.
if (offset > codePoints.length - count) {
throw new StringIndexOutOfBoundsException(offset + count);
}
final int end = offset + count;
// Pass 1: Compute precise size of char[]
int n = count;
for (int i = offset; i < end; i++) {
int c = codePoints[i];
if (Character.isBmpCodePoint(c))
continue;
else if (Character.isValidCodePoint(c))
n++;
else throw new IllegalArgumentException(Integer.toString(c));
}
// Pass 2: Allocate and fill in char[]
final char[] v = new char[n];
for (int i = offset, j = 0; i < end; i++, j++) {
int c = codePoints[i];
if (Character.isBmpCodePoint(c))
v[j] = (char)c;
else
Character.toSurrogates(c, v, j++);
}
this.value = v;
}
}
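As you can see, the String class is declared final. The final modifier can be applied to classes, methods, and variables: a final class cannot be subclassed, a final method cannot be overridden by a subclass, and a final variable cannot be reassigned once it has been initialized.
A final class alone does not make String immutable, though. The characters live in private final char value[], the class exposes no method that modifies that array, and because the class is final no subclass can add one. So the content a String object holds never changes. Take the earlier example:
value_a = "original value"; this string never changes: its characters are stored in char value[] and are never modified.
value_a = "changed"; assigning to value_a afterwards points the value_a reference at the new string "changed"; it does not alter the char value[] that holds "original value". The original string is simply no longer referenced by value_a (being a literal, it normally stays in the string constant pool rather than being garbage collected right away).
Conclusion: String immutability means the content of a String object never changes; a String reference, however, can be pointed at a different string.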
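The conclusion above can be checked with a small sketch. The class and method names here (ReassignDemo, aliasAfterReassign) are made up for illustration; the point is only that a second reference to the original object still sees the old content after the first variable is reassigned.

```java
final class ReassignDemo {
    // Returns what a second reference sees after the first variable is reassigned.
    static String aliasAfterReassign() {
        String valueA = "original value";
        String alias = valueA;   // second reference to the same String object
        valueA = "changed";      // repoints valueA; the original object is untouched
        return alias;
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterReassign()); // original value

        String s = "abc";
        String upper = s.toUpperCase();           // "modifying" methods return a new String
        System.out.println(s + " " + upper);      // abc ABC
    }
}
```

Methods such as toUpperCase() follow the same rule: they hand back a new String and leave the receiver as it was.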
Next, an example
String pre = "Hello,World";
String new_value = "Hello,World";
String object_value = new String("Hello,World");
System.out.println(object_value.hashCode());
System.out.println(pre.hashCode());
System.out.println(new_value.hashCode());
System.out.print("pre == new_value -> ");
System.out.println(pre==new_value);
System.out.print("pre == object_value -> ");
System.out.println(pre==object_value);
System.out.println("----------------");
System.out.print("pre.equals(new_value) -> ");
System.out.println(pre.equals(new_value));
System.out.print("pre.equals(object_value) -> ");
System.out.println(pre.equals(object_value));
Will the program above print true or false in each case?
Strings created from the literal "Hello,World" compare equal under both == and equals, no matter how many times the literal appears.
The string created with new and the one created from a literal compare equal under equals, but == returns false for them.
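First, the difference between equals and ==
equals vs ==
==: for object references, checks whether the two references point to the same object (the same address)
equals: the root superclass Object defines an equals method
public boolean equals(Object obj) {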
return (this == obj);
}
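Object.equals simply checks whether the two references are the same object, which is exactly what == does.
And every class inherits from Object.
Here is the official description:
package java.lang;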
/**
* Class {@code Object} is the root of the class hierarchy.
* Every class has {@code Object} as a superclass. All objects,
* including arrays, implement the methods of this class.
* @author unascribed
* @see java.lang.Class
* @since JDK1.0
*/
public class Object
So String also inherits from Object and has an equals method.
Let's see how String implements equals:
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true; // same reference (==): the two are equal
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) { // walk the value arrays of both strings
                if (v1[i] != v2[i])
                    return false; // any differing character means not equal
                i++;
            }
            return true; // every character matches: the two are equal
        }
    }
    return false; // argument is not a String, or the lengths differ: not equal
}
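The String equals method therefore works like this:
1. If the two references are the same object (==), return true
2. Check whether the argument is a String
3. If the lengths match, walk both value arrays; if any character differs, the strings are not equal
4. If every character matches, the strings are equal
5. If the argument is not a String (or the lengths differ), the two are not equal
Object.equals, by contrast, only checks whether the two references are the same object.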
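To make the contrast concrete, here is a minimal sketch with two hypothetical classes (PlainPoint and ValuePoint are invented for this illustration, they are not JDK types). One inherits Object's identity-based equals; the other overrides it with a content comparison, in the same spirit as String.

```java
import java.util.Objects;

final class EqualsDemo {
    // Does not override equals: inherits Object's reference comparison.
    static final class PlainPoint {
        final int x, y;
        PlainPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Overrides equals (and hashCode) to compare by content.
    static final class ValuePoint {
        final int x, y;
        ValuePoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;                   // same reference
            if (!(o instanceof ValuePoint)) return false; // wrong type (or null)
            ValuePoint other = (ValuePoint) o;
            return x == other.x && y == other.y;          // same content
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    static boolean plainEquals() {
        return new PlainPoint(1, 2).equals(new PlainPoint(1, 2));
    }

    static boolean valueEquals() {
        return new ValuePoint(1, 2).equals(new ValuePoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(plainEquals()); // false: two distinct objects
        System.out.println(valueEquals()); // true: same content
    }
}
```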
So let's look at the earlier example again
String pre = "Hello,World";
String new_value = "Hello,World";
String object_value = new String("Hello,World");
System.out.println(object_value.hashCode());
System.out.println(pre.hashCode());
System.out.println(new_value.hashCode());
System.out.print("pre == new_value -> ");
System.out.println(pre==new_value);
System.out.print("pre == object_value -> ");
System.out.println(pre==object_value);
System.out.println("----------------");
System.out.print("pre.equals(new_value) -> ");
System.out.println(pre.equals(new_value));
System.out.print("pre.equals(object_value) -> ");
System.out.println(pre.equals(object_value));
pre and new_value are both created from literals. The string "Hello,World" is stored in the string constant pool, and the reference variables pre and new_value point to that same "Hello,World" object. They are literally one object, so both == and equals report them as equal.
object_value is a new object created on the heap; the reference variable object_value points to that heap "Hello,World" object.
So pre==object_value returns false: they are two different objects.
pre.equals(new_value) and pre.equals(object_value) both return true: String overrides equals, so it returns true whenever both sides are String objects whose value arrays hold the same characters.
hashCode
Now, why do all three have the same hash code?
Straight to the source:
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
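hashCode() returns the object's hash code (also called a hash value), which is an int. Hash-based collections use it to decide which bucket an object goes into. Ideally, different objects produce different hash codes, but this is not required. The hash code is usually computed from the object's fields, and different objects may produce the same hash code, which is known as a hash collision.
The implementation above boils down to:
For a character array value, let h[n] be the hash after processing the characters up to index n:
n=0: h[0] = value[0]
n=1: h[1] = h[0]*31 + value[1]
n=2: h[2] = h[1]*31 + value[2] = (value[0]*31 + value[1])*31 + value[2]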
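The recurrence can be replayed by hand and compared against the built-in result. This is only a sketch: manualHash is a helper written for this illustration, not a JDK method.

```java
final class ManualHashDemo {
    // Re-implements the recurrence h = 31 * h + value[i], starting from h = 0.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String text = "Hello,World";
        System.out.println(manualHash(text) == text.hashCode()); // true
        System.out.println(manualHash("ab"));                    // 97 * 31 + 98 = 3105
    }
}
```

Integer overflow is left to wrap around, exactly as in the source shown earlier.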
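Why 31:
Cheaper arithmetic: 31 * i can be rewritten as (i << 5) - i, a shift and a subtraction that a compiler or JIT can use in place of a multiplication. In binary 31 is 11111, so shifting i left by 5 bits multiplies it by 32, and subtracting i once leaves 31 * i.
Fewer collisions: 31 is an odd prime, and an odd prime multiplier is the conventional choice because it tends to spread typical strings across hash values reasonably well. It lowers the chance of collisions but does not eliminate them.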
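Notes:
1. If two objects are equal according to equals, their hashCode values must be the same
2. If two objects have the same hashCode, they are not necessarily equal according to equals
3. Objects that are not equal should produce different hash codes wherever possible, to reduce collisions in hash tables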
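Point 2 is easy to demonstrate with a well-known pair of short strings. The sketch below uses "Aa" and "BB", which hash to the same value under the formula above (65*31 + 97 and 66*31 + 66 are both 2112); the class and helper names are made up for the example.

```java
final class HashCollisionDemo {
    // True when both strings produce the same hash code.
    static boolean sameHash(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());      // 2112
        System.out.println("BB".hashCode());      // 2112
        System.out.println(sameHash("Aa", "BB")); // true: a hash collision
        System.out.println("Aa".equals("BB"));    // false: still different strings
    }
}
```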
pre, new_value, and object_value all hold the same characters, "Hello,World", in their value arrays,
so the hash algorithm produces the same hash value for all three.
Next, another example
String a = "Hello";
String b = "Hello";
String c = new String("Hello");
String e = a+b;
String t = a+b+c;
String l = a+"World";
String l1 = "Hello"+"World";
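How many objects does this program create?
a: "Hello" is one object, placed in the string constant pool
b: "Hello" already exists in the string constant pool, so no new object is created
c: new String creates one new String object "Hello" on the heap, and the local variable c refers to it
e: concatenating non-constant operands with + happens at runtime (javac up to Java 8 compiles it to a StringBuilder; newer compilers use invokedynamic), so a new String "HelloHello" is created on the heap, plus a temporary StringBuilder where one is used
t: a and b refer to the pooled "Hello" and c refers to the existing heap object, so none of them is created again, but the result "HelloHelloHello" is a new object on the heap
l: a is not a compile-time constant, so a+"World" is also concatenated at runtime and yields a new "HelloWorld" object on the heap
l1: "Hello"+"World" is folded into "HelloWorld" at compile time, which creates one object in the constant pool. Unlike l, which is built at runtime, l1 is the pooled object, so l and l1 are not the same object
So, counting one object per line that creates something, the total is 6 (temporary builders, and the "World" literal if the compiler emits it as a separate constant, would come on top of that)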
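Which of these variables share an object can be probed with ==. The following is a rough sketch (the class name and the check() helper are invented here); it relies only on the language rule that compile-time constant strings are pooled while runtime concatenation produces a new object.

```java
final class ConcatIdentityDemo {
    // Identity and equality checks for the variables discussed above.
    static boolean[] check() {
        String a = "Hello";
        String b = "Hello";
        String c = new String("Hello");
        String l = a + "World";        // runtime concatenation: new heap object
        String l1 = "Hello" + "World"; // folded at compile time: pooled constant
        return new boolean[] {
            a == b,             // same pooled literal
            a == c,             // new String(...) is a separate object
            l == l1,            // heap object vs pooled constant
            l.equals(l1),       // same characters
            l1 == "HelloWorld"  // both are the pooled constant
        };
    }

    public static void main(String[] args) {
        // Prints: true, false, false, true, true (one per line)
        for (boolean r : check()) {
            System.out.println(r);
        }
    }
}
```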
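Summary
Creating from a literal
String a = "Hello"; // stores "Hello" in the string constant pool: 1 new object, referenced by a
String b = "Hello"; // "Hello" found in the string constant pool: no new object, b refers to it
String c = "World"; // "World" not found in the pool: it is stored there, 1 new object, referenced by c
Creating with new
String a = "Hello"; // stores "Hello" in the string constant pool: 1 new object, referenced by a
String b = "Hello"; // "Hello" found in the string constant pool: no new object, b refers to it
String c = "World"; // "World" not found in the pool: it is stored there, 1 new object, referenced by c
String t = new String("Hello"); // "Hello" is already in the pool, so nothing is added there; a new object is created on the heap and t refers to it
String t1 = new String("HelloWorld"); // "HelloWorld" is not in the pool yet, so it is stored there; a new object is also created on the heap and t1 refers to it
Concatenating with +
String a = "Hello"; // stores "Hello" in the string constant pool: 1 new object, referenced by a
String b = "Hello"; // "Hello" found in the string constant pool: no new object, b refers to it
String c = "World"; // "World" not found in the pool: it is stored there, 1 new object, referenced by c
String t = new String("Hello"); // "Hello" is already in the pool, so nothing is added there; a new object is created on the heap and t refers to it
String t1 = new String("HelloWorld"); // "HelloWorld" is not in the pool yet, so it is stored there; a new object is also created on the heap and t1 refers to it
String b1 = "Hello"+"World"; // "HelloWorld" is already in the pool, so no new object is created
String new_value = "Hello"+"Java"; // "HelloJava" is not in the pool, so it is created there
String value_new = a+b; // concatenated at runtime (via StringBuilder), which creates a new object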
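Concatenating constants such as "Hello": the pieces are joined directly and the result goes into the constant pool at compile time (constant folding).
Not every constant is folded, though; only constants whose value the compiler can determine at compile time qualify:
- Primitive literals (byte, boolean, short, char, int, float, long, double) and string literals
- final variables of primitive or String type
- Strings obtained by joining such constants with "+", arithmetic between primitives (add, subtract, multiply, divide), and shift operations on primitives (<<, >>, >>>)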
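The value of a reference cannot be determined at compile time, so the compiler cannot optimize it.
Concatenating through reference variables: under the hood, the concatenation uses StringBuilder's append method. A StringBuilder is created, append is called for each operand, and then toString() is called. The result is a different String object that lives on the heap and is not placed in the constant pool.
Note: String.intern() returns the canonical pooled instance of a string. If the pool already contains an equal string, that pooled reference is returned; otherwise this string is added to the pool and a reference to it is returned.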
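A short sketch of intern() in action, assuming nothing beyond the standard String API (the class name and check() helper are made up for the example):

```java
final class InternDemo {
    // Compares a heap String with the pooled literal, before and after intern().
    static boolean[] check() {
        String literal = "Hello";          // pooled
        String heap = new String("Hello"); // separate heap object
        return new boolean[] {
            heap == literal,           // different objects
            heap.intern() == literal,  // intern() returns the pooled instance
            heap.equals(literal)       // same characters either way
        };
    }

    public static void main(String[] args) {
        // Prints: false, true, true (one per line)
        for (boolean r : check()) {
            System.out.println(r);
        }
    }
}
```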
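In other words, joining object references with "+" is really done by calling append() on a StringBuilder and then toString() to obtain a String object.
Note that concatenation through references to final constants can still be folded at compile time, with the result going into the constant pool.
But if final is applied to a value that cannot be determined until runtime, the concatenation still happens at runtime through StringBuilder.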
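The two final cases can be told apart with ==. In this sketch, runtimeHello() is a hypothetical helper standing in for any value that is only known at runtime; the rest uses plain language features.

```java
final class FinalFoldingDemo {
    // Hypothetical helper: the compiler cannot know this value at compile time.
    static String runtimeHello() {
        return "Hello";
    }

    static boolean[] check() {
        final String constant = "Hello";       // final + literal: compile-time constant
        final String runtime = runtimeHello(); // final, but only known at runtime
        String folded = constant + "World";    // folded into the pooled "HelloWorld"
        String built = runtime + "World";      // concatenated at runtime: new object
        String literal = "HelloWorld";
        return new boolean[] {
            folded == literal,     // same pooled object
            built == literal,      // different objects
            built.equals(literal)  // same characters
        };
    }

    public static void main(String[] args) {
        // Prints: true, false, true (one per line)
        for (boolean r : check()) {
            System.out.println(r);
        }
    }
}
```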
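In day-to-day development, avoid chaining many string concatenations with +, especially inside loops.
That is where StringBuilder and StringBuffer come in.
StringBuilder
Because String is immutable, the StringBuilder class was designed later on.
StringBuilder is a mutable string class in Java, mainly used where a string's content is modified frequently, in order to improve performance.
Advantages and use cases of StringBuilder
- Performance: StringBuilder is mutable, so modifications act directly on the current object and no new object is created; it therefore performs better when strings are concatenated or modified frequently.
- Use cases: best suited to single-threaded code, especially local applications or single-threaded tasks that modify strings frequently, where StringBuilder outperforms StringBuffer.
Source code
public final class StringBuilder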
extends AbstractStringBuilder
implements java.io.Serializable, CharSequence
{
/** use serialVersionUID for interoperability */
static final long serialVersionUID = 4383685877147921099L;
/**
* Constructs a string builder with no characters in it and an
* initial capacity of 16 characters.
*/
public StringBuilder() {
super(16);
}
/**
* Constructs a string builder with no characters in it and an
* initial capacity specified by the {@code capacity} argument.
*
* @param capacity the initial capacity.
* @throws NegativeArraySizeException if the {@code capacity}
* argument is less than {@code 0}.
*/
public StringBuilder(int capacity) {
super(capacity);
}
/**
* Constructs a string builder initialized to the contents of the
* specified string. The initial capacity of the string builder is
* {@code 16} plus the length of the string argument.
*
* @param str the initial contents of the buffer.
*/
public StringBuilder(String str) {
super(str.length() + 16);
append(str);
}
/**
* Constructs a string builder that contains the same characters
* as the specified {@code CharSequence}. The initial capacity of
* the string builder is {@code 16} plus the length of the
* {@code CharSequence} argument.
*
* @param seq the sequence to copy.
*/
public StringBuilder(CharSequence seq) {
this(seq.length() + 16);
append(seq);
}
@Override
public StringBuilder append(Object obj) {
return append(String.valueOf(obj));
}
@Override
public StringBuilder append(String str) {
super.append(str);
return this;
}
/**
* Appends the specified {@code StringBuffer} to this sequence.
* <p>
* The characters of the {@code StringBuffer} argument are appended,
* in order, to this sequence, increasing the
* length of this sequence by the length of the argument.
* If {@code sb} is {@code null}, then the four characters
* {@code "null"} are appended to this sequence.
* <p>
* Let <i>n</i> be the length of this character sequence just prior to
* execution of the {@code append} method. Then the character at index
* <i>k</i> in the new character sequence is equal to the character at
* index <i>k</i> in the old character sequence, if <i>k</i> is less than
* <i>n</i>; otherwise, it is equal to the character at index <i>k-n</i>
* in the argument {@code sb}.
*
* @param sb the {@code StringBuffer} to append.
* @return a reference to this object.
*/
public StringBuilder append(StringBuffer sb) {
super.append(sb);
return this;
}
@Override
public StringBuilder append(CharSequence s) {
super.append(s);
return this;
}
}
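As you can see, there are four constructors:
StringBuilder(): a StringBuilder with an initial capacity of 16 characters
StringBuilder(int capacity): a StringBuilder with the specified initial capacity
StringBuilder(CharSequence seq): contains the same character sequence as the specified CharSequence
StringBuilder(String str): contains the same character sequence as the specified String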
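The initial capacities stated in the Javadoc above can be observed through capacity(). A small sketch (the class name and helper are invented for the example):

```java
final class CapacityDemo {
    // Initial capacities reported by each of the four constructors.
    static int[] capacities() {
        CharSequence seq = "abc";
        return new int[] {
            new StringBuilder().capacity(),      // default: 16
            new StringBuilder(32).capacity(),    // as requested: 32
            new StringBuilder("abc").capacity(), // 16 + length of the String: 19
            new StringBuilder(seq).capacity()    // 16 + length of the CharSequence: 19
        };
    }

    public static void main(String[] args) {
        // Prints: 16, 32, 19, 19 (one per line)
        for (int c : capacities()) {
            System.out.println(c);
        }
    }
}
```

Capacity is only the size of the internal buffer; length() stays at the number of characters actually stored.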
Common methods:
append():
public AbstractStringBuilder append(String str) {
if (str == null)
return appendNull();
int len = str.length();
ensureCapacityInternal(count + len);
str.getChars(0, len, value, count);
count += len;
return this;
}
The method returns this, so no new object is created; compared with concatenating String objects, this uses far less memory.
insert():
public AbstractStringBuilder insert(int offset, char[] str) {
if ((offset < 0) || (offset > length()))
throw new StringIndexOutOfBoundsException(offset);
int len = str.length;
ensureCapacityInternal(count + len);
System.arraycopy(value, offset, value, offset + len, count - offset);
System.arraycopy(str, 0, value, offset, len);
count += len;
return this;
}
Again the method returns this, so no new object is created; compared with String, this uses far less memory.
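A brief usage sketch of append() and insert(), showing that both mutate and return the same builder. The class and method names are made up for illustration.

```java
final class BuilderDemo {
    // Builds "Hello,World!" by mutating a single StringBuilder in place.
    static String build() {
        StringBuilder sb = new StringBuilder("World");
        sb.append('!');                       // appends at the end
        sb.insert(0, "Hello,".toCharArray()); // insert(int, char[]) at the front
        return sb.toString();
    }

    // append returns the builder itself, which is what makes call chaining possible.
    static boolean appendReturnsSameObject() {
        StringBuilder sb = new StringBuilder();
        return sb.append("x") == sb;
    }

    public static void main(String[] args) {
        System.out.println(build());                   // Hello,World!
        System.out.println(appendReturnsSameObject()); // true
    }
}
```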
toString():
@Override
public String toString() {
// Create a copy, don't share the array
return new String(value, 0, count);
}
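StringBuilder is not thread-safe: if several threads operate on the same StringBuilder object at the same time, concurrency problems can occur.
StringBuffer
StringBuffer is thread-safe, mainly because its methods are declared with the synchronized keyword. Multiple threads can therefore access and modify the same StringBuffer object safely, without causing inconsistent data or other threading problems.
@Override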
public synchronized StringBuffer append(String str) {
toStringCache = null;
super.append(str);
return this;
}
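synchronized is a lock: while the current thread is executing a synchronized method, other threads calling synchronized methods on the same object wait until the current thread releases the lock.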
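As a rough illustration of that guarantee, the sketch below has several threads append to one shared StringBuffer and then checks that no append was lost. The class name, method name, and thread counts are arbitrary choices for the example.

```java
final class BufferThreadsDemo {
    // Appends from several threads into one StringBuffer and returns the final length.
    static int appendFromThreads(int threads, int perThread) {
        StringBuffer buffer = new StringBuffer();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    buffer.append('x'); // synchronized: one thread at a time
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait for every worker to finish
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromThreads(4, 10000)); // 40000
    }
}
```

Swapping in a StringBuilder here would make the result unpredictable: appends may be lost or an exception may be thrown, which is why no expected value is given for that variant.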