What You Should Know About Slices in Golang

July 21, 2024 golang 17 min read (about 2577 words)

When you discover your mission, you will feel its demand. It will fill you with enthusiasm and a burning desire to get to work on it.
— W. Clement Stone

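1. Arrays vs. Slices

1.1 Declaration and Initialization

An array is a collection of elements of the same type. Its length and element type must be specified when it is defined. The length is part of the type, is fixed at compile time, and cannot grow dynamically.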

func main() {
    var arrays [3]int   // declared; elements take the default zero value
    var arrays1 = [4]int{1, 2, 3, 4}  // declared and initialized together
    var arrays2 = [...]int{1, 2, 3, 4, 5} // ... infers the length from the initializer
    fmt.Println(arrays)    // [0 0 0]
    fmt.Println(arrays1)   // [1 2 3 4]
    fmt.Println(arrays2)   // [1 2 3 4 5]
}

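Arrays have relatively limited use cases; slices are far more common. A slice is a variable-length sequence of elements of the same type. It is a thin wrapper over an array, is very flexible, and grows automatically when you append to it. Its runtime representation is: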

type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

func main() {
	var slice []int                       // plain declaration
	fmt.Println(len(slice), cap(slice))   // 0 0
	slice1 := []int{1, 2, 3, 4}           // slice literal
	fmt.Println(len(slice1), cap(slice1)) // 4 4
	slice2 := make([]int, 3, 5)           // built with make()
	fmt.Println(len(slice2), cap(slice2)) // 3 5

	slice3 := append(slice1, 1)
	fmt.Println(len(slice1), cap(slice1)) // 4 4
	fmt.Println(len(slice3), cap(slice3)) // 5 8
	slice4 := slice3[1:5]
	fmt.Println(len(slice4), cap(slice4), slice4) // 4 7 [2 3 4 1]
}

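You may be curious about the output for slice3 and slice4. The capacity of 8 for slice3 comes from the growth strategy described later in this post, and slice4 has capacity 7 because reslicing from index 1 leaves 8 - 1 = 7 elements of the backing array available.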
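The relationship between a reslice and its capacity can be checked with a small sketch. The helper name lenCap is made up for this illustration and is not from the original post:

```go
package main

import "fmt"

// lenCap (hypothetical helper) reports the length and capacity of s.
func lenCap(s []int) (int, int) {
	return len(s), cap(s)
}

func main() {
	base := make([]int, 5, 8) // len 5, cap 8
	sub := base[1:5]          // shares base's backing array, starting at index 1

	l, c := lenCap(sub)
	fmt.Println(l, c) // 4 7

	// A full slice expression base[low:high:max] limits the capacity to max-low.
	limited := base[1:5:5]
	l, c = lenCap(limited)
	fmt.Println(l, c) // 4 4
}
```

The three-index form is handy when you hand a sub-slice to other code and do not want its appends to write into the rest of the backing array.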

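1.2 Function Parameters

Go only passes arguments by value, so if you pass an array to a function, modifying its elements inside the function does not affect the original array.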
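As a minimal sketch of the array case (the function name modifyArray is an assumption for this illustration), the callee works on its own copy of the whole array:

```go
package main

import "fmt"

// modifyArray (hypothetical) receives a copy of the entire array,
// so the write below never reaches the caller's array.
func modifyArray(a [2]string) [2]string {
	a[0] = "tfrain"
	return a
}

func main() {
	orig := [2]string{"wesleywei", "medium"}
	changed := modifyArray(orig)
	fmt.Println(orig)    // [wesleywei medium]
	fmt.Println(changed) // [tfrain medium]
}
```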

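Slices behave differently. As the struct above shows, passing a slice to a function passes a copy of the pointer to the backing array, together with len and cap. The copied pointer may still refer to the same array, so modifying elements inside the function may affect the original array.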

func modifySlice(s []string) {
	s[0] = "tfrain"
	s[1] = "github"
	fmt.Println("modifySlice slice: ", s)
}

func main() {
	s := []string{"wesleywei", "medium"}
	fmt.Println("main slice: ", s)
	modifySlice(s)
	fmt.Println("main slice: ", s)
}
// main slice:  [wesleywei medium]
// modifySlice slice:  [tfrain github]
// main slice:  [tfrain github]

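Note that I said "may", and the example above is one where the original array is affected. If the function changes which backing array its slice points to, for example through growth on append or by switching to a copy made with copy, its later writes no longer affect the caller's array.

The growth scenario is covered below, which will help you tell these cases apart. A bit complicated? Perhaps that is the price of slices being more flexible and more widely applicable.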
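One way to see the "no longer affects the caller" case is a function that clones its argument before writing. This is only a sketch; modifyCopy is a made-up name, not something from the original post:

```go
package main

import "fmt"

// modifyCopy (hypothetical) copies s into a new backing array first,
// so the write to local[0] stays invisible to the caller.
func modifyCopy(s []string) []string {
	local := make([]string, len(s))
	copy(local, s)
	local[0] = "tfrain"
	return local
}

func main() {
	s := []string{"wesleywei", "medium"}
	changed := modifyCopy(s)
	fmt.Println(s)       // [wesleywei medium]
	fmt.Println(changed) // [tfrain medium]
}
```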

2. Copying a Large Slice vs. Copying a Small Slice

Go only passes by value, and the slice structure was shown above:

type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

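When a slice is copied by assignment or argument passing, only these three fields are copied. A large slice differs from a small one only in having larger len and cap values, so the cost of copying either one is essentially the same.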
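A quick way to convince yourself is to measure the slice value itself with unsafe.Sizeof. The sketch below assumes nothing beyond the standard library; headerSize is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSize (hypothetical) returns the size in bytes of the slice value
// itself, that is the pointer, len and cap, not the elements it refers to.
func headerSize(s []int) uintptr {
	return unsafe.Sizeof(s)
}

func main() {
	small := make([]int, 1)
	large := make([]int, 1_000_000)

	fmt.Println(headerSize(small) == headerSize(large)) // true
	fmt.Println(headerSize(large))                      // 24 on typical 64-bit platforms
}
```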

3. Shallow and Deep Copies of Slices

func main() {
	slice1 := []int{1, 2, 3, 4}
	arrayPtr1 := (*int)(unsafe.Pointer(&slice1[0]))
	fmt.Printf("The address of the underlying array of slice1: %p\n", arrayPtr1)

	slice2 := slice1
	arrayPtr2 := (*int)(unsafe.Pointer(&slice2[0]))
	fmt.Printf("The address of the underlying array of slice2: %p\n", arrayPtr2)

	slice3 := slice2[:]
	arrayPtr3 := (*int)(unsafe.Pointer(&slice3[0]))
	fmt.Printf("The address of the underlying array of slice3: %p\n", arrayPtr3)

	slice4 := make([]int, len(slice3))
	copy(slice4, slice3)
	arrayPtr4 := (*int)(unsafe.Pointer(&slice4[0]))
	fmt.Printf("The address of the underlying array of slice4: %p\n", arrayPtr4)
}

// The address of the underlying array of slice1: 0xc00007a000
// The address of the underlying array of slice2: 0xc00007a000
// The address of the underlying array of slice3: 0xc00007a000
// The address of the underlying array of slice4: 0xc00007a020

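The example shows that:

Copying a slice with the := or = operator is a shallow copy.
Copying a slice with the [:] slice expression is also a shallow copy.
Copying a slice with Go's built-in copy() function into a separately allocated slice is a deep copy.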
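The addresses above tell half the story; the practical consequence is which copies observe a later write. A small sketch, where cloneInts is a hypothetical helper:

```go
package main

import "fmt"

// cloneInts (hypothetical) returns a copy of src backed by a newly allocated array.
// Note: copy duplicates element values only; for pointer elements the pointees stay shared.
func cloneInts(src []int) []int {
	dst := make([]int, len(src))
	copy(dst, src)
	return dst
}

func main() {
	src := []int{1, 2, 3, 4}
	alias := src           // shallow: same backing array
	view := src[:]         // shallow: same backing array
	deep := cloneInts(src) // deep: separate backing array

	src[0] = 100
	fmt.Println(alias[0], view[0], deep[0]) // 100 100 1
}
```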

LeetCode 46 (Permutations) helps illustrate what copy is for:

func permute(nums []int) [][]int {
	var res [][]int
	n := len(nums)
	visited := make([]bool, n)
	var build func(subs []int)
	build = func(subs []int) {
		if len(subs) == n {
			// subs may share its backing array with sibling branches,
			// so store an independent copy of it.
			tmp := make([]int, n)
			copy(tmp, subs)
			res = append(res, tmp)
			return
		}
		for i := 0; i < n; i++ {
			if visited[i] {
				continue
			}
			visited[i] = true
			build(append(subs, nums[i]))
			visited[i] = false
		}
	}
	build(nil)
	return res
}
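The reason copy matters in the permutation code is that two appends to the same slice can land in the same slot of a shared backing array. The following sketch isolates that effect; appendBoth is a made-up helper for this illustration:

```go
package main

import "fmt"

// appendBoth (hypothetical) appends x and then y to the same base slice
// and returns both results.
func appendBoth(base []int, x, y int) ([]int, []int) {
	a := append(base, x)
	b := append(base, y)
	return a, b
}

func main() {
	// Spare capacity: neither append reallocates, so both write index 2
	// of the same backing array and the second write wins.
	roomy := make([]int, 2, 4)
	roomy[0], roomy[1] = 1, 2
	a, b := appendBoth(roomy, 3, 4)
	fmt.Println(a, b) // [1 2 4] [1 2 4]

	// No spare capacity: each append allocates its own array.
	tight := []int{1, 2}
	c, d := appendBoth(tight, 3, 4)
	fmt.Println(c, d) // [1 2 3] [1 2 4]
}
```

Copying the finished permutation into tmp sidesteps this aliasing regardless of how much spare capacity subs happens to have.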

4. Slice Growth Strategy

From the go1.20 slice implementation (growslice in runtime/slice.go):

...
	newcap := oldCap
	doublecap := newcap + newcap
	if newLen > doublecap {
		newcap = newLen
	} else {
		const threshold = 256
		if oldCap < threshold {
			newcap = doublecap
		} else {
			// Check 0 < newcap to detect overflow
			// and prevent an infinite loop.
			for 0 < newcap && newcap < newLen {
				// Transition from growing 2x for small slices
				// to growing 1.25x for large slices. This formula
				// gives a smooth-ish transition between the two.
				newcap += (newcap + 3*threshold) / 4
			}
			// Set newcap to the requested cap when
			// the newcap calculation overflowed.
			if newcap <= 0 {
				newcap = newLen
			}
		}
	}
...

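If the requested length exceeds twice the old capacity, the new capacity is simply the requested length. Otherwise, when the old capacity is below 256, the new capacity is doubled; from 256 upward, each step adds (newcap + 3*256)/4, a growth factor that starts at 2x and approaches 1.25x as the slice gets larger. For example, a capacity of 256 grows to 256 + (256 + 768)/4 = 512, while 1024 grows to 1024 + (1024 + 768)/4 = 1472, roughly 1.44x.
Later in the source, the computed capacity is rounded up for memory alignment. This is tied to the memory allocation strategy and is fairly involved, so we ignore it here.
After growth, the slice points to a new backing array.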
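To watch the growth happen, the sketch below records every capacity change while appending. growthSteps is a hypothetical helper, and the exact numbers it prints depend on the Go version and on the allocator's rounding, so no expected output is given:

```go
package main

import "fmt"

// growthSteps (hypothetical) appends n values to an empty slice and
// records the capacity each time it changes.
func growthSteps(n int) []int {
	var s []int
	var caps []int
	last := cap(s)
	for i := 0; i < n; i++ {
		s = append(s, i)
		if cap(s) != last {
			last = cap(s)
			caps = append(caps, last)
		}
	}
	return caps
}

func main() {
	// Expect doubling for small capacities and a slower rate later on.
	fmt.Println(growthSteps(2000))
}
```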

5. Empty Slices, nil Slices, and Zero Slices

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	// nil slice
	var nilSlice []int

	// empty slice
	emptySlice := []int{}
	emptySlice2 := make([]int, 0)

	fmt.Printf("nilSlice: len=%d, cap=%d, is nil: %t\n", len(nilSlice), cap(nilSlice), nilSlice == nil)
	fmt.Printf("emptySlice: len=%d, cap=%d, is nil: %t\n", len(emptySlice), cap(emptySlice), emptySlice == nil)
	fmt.Printf("emptySlice2: len=%d, cap=%d, is nil: %t\n", len(emptySlice2), cap(emptySlice2), emptySlice2 == nil)

	nilSliceHeader := (*reflect.SliceHeader)(unsafe.Pointer(&nilSlice))
	emptySliceHeader := (*reflect.SliceHeader)(unsafe.Pointer(&emptySlice))
	emptySlice2Header := (*reflect.SliceHeader)(unsafe.Pointer(&emptySlice2))

	fmt.Printf("Pointer of nilSlice: %p\n", unsafe.Pointer(nilSliceHeader.Data))
	fmt.Printf("Pointer of emptySlice: %p\n", unsafe.Pointer(emptySliceHeader.Data))
	fmt.Printf("Pointer of emptySlice2: %p\n", unsafe.Pointer(emptySlice2Header.Data))

	if emptySliceHeader.Data == emptySlice2Header.Data {
		fmt.Println("emptySlice and emptySlice2 point to the same zerobase address.")
	} else {
		fmt.Println("emptySlice and emptySlice2 do not point to the same zerobase address.")
	}
}

// nilSlice: len=0, cap=0, is nil: true
// emptySlice: len=0, cap=0, is nil: false
// emptySlice2: len=0, cap=0, is nil: false
// Pointer of nilSlice: 0x0
// Pointer of emptySlice: 0x58f360
// Pointer of emptySlice2: 0x58f360
// emptySlice and emptySlice2 point to the same zerobase address.

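A nil slice has length and capacity 0, and comparing it with nil yields true.
An empty slice also has length and capacity 0, but comparing it with nil yields false, because its data pointer is not nil: the data pointers of all empty slices point to the address of zerobase.

// base address for all 0-byte allocations
var zerobase uintptr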

go1.20 zerobase

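slice := make([]int, 5)   // 0 0 0 0 0
slice2 := make([]*int, 5) // nil nil nil nil nil

A zero slice is one whose backing array holds only zero values (all nil for pointer element types). A slice created with make whose length and capacity are both nonzero is a zero slice.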
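In everyday code a nil slice is perfectly usable: len works on it and append allocates a backing array on demand. A minimal sketch, with collect as a hypothetical helper name:

```go
package main

import "fmt"

// collect (hypothetical) appends vals to dst; dst may be nil.
func collect(dst []int, vals ...int) []int {
	for _, v := range vals {
		dst = append(dst, v)
	}
	return dst
}

func main() {
	var nilSlice []int // nil: no backing array yet
	fmt.Println(len(nilSlice), nilSlice == nil) // 0 true

	got := collect(nilSlice, 1, 2, 3)
	fmt.Println(got, got == nil) // [1 2 3] false
}
```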

6. Passing a Slice vs. Passing a Pointer to a Slice

Let's start with an example:

package main

import (
	"fmt"
	"unsafe"
)

func modifySlice(s []int) {
	s[0] = 100
	s = append(s, 200)
	fmt.Println("Inside modifySlice (modified slice):", s)
	arrayPtr := (*int)(unsafe.Pointer(&s[0]))
	fmt.Printf("The address of the underlying array in modifySlice: %p\n", arrayPtr)
}

func modifySlicePointer(s *[]int) {
	(*s)[0] = 100
	*s = append(*s, 200)
	fmt.Println("Inside modifySlicePointer (modified slice pointer):", *s)
	arrayPtr := (*int)(unsafe.Pointer(&(*s)[0]))
	fmt.Printf("The address of the underlying array in modifySlicePointer: %p\n", arrayPtr)
}

func main() {
	originalSlice := make([]int, 3, 4)
	originalSlice[0], originalSlice[1], originalSlice[2] = 0, 1, 2
	// originalSlice := []int{0, 1, 2}

	fmt.Println("Original slice before modifySlice:", originalSlice)
	arrayPtr := (*int)(unsafe.Pointer(&originalSlice[0]))
	fmt.Printf("The address of the underlying array before modifySlice: %p\n", arrayPtr)

	modifySlice(originalSlice)
	fmt.Println("Original slice after modifySlice:", originalSlice)
	arrayPtr = (*int)(unsafe.Pointer(&originalSlice[0]))
	fmt.Printf("The address of the underlying array after modifySlice: %p\n", arrayPtr)

	modifySlicePointer(&originalSlice)
	fmt.Println("Original slice after modifySlicePointer:", originalSlice)
	arrayPtr = (*int)(unsafe.Pointer(&originalSlice[0]))
	fmt.Printf("The address of the underlying array after modifySlicePointer: %p\n", arrayPtr)
}

Run code In Go1.22

If originalSlice is:

originalSlice := make([]int, 3, 4)
originalSlice[0], originalSlice[1], originalSlice[2] = 0, 1, 2

the output is:

Original slice before modifySlice: [0 1 2]
The address of the underlying array before modifySlice: 0xc000126000
Inside modifySlice (modified slice): [100 1 2 200]
The address of the underlying array in modifySlice: 0xc000126000
Original slice after modifySlice: [100 1 2]
The address of the underlying array after modifySlice: 0xc000126000
Inside modifySlicePointer (modified slice pointer): [100 1 2 200]
The address of the underlying array in modifySlicePointer: 0xc000126000
Original slice after modifySlicePointer: [100 1 2 200]
The address of the underlying array after modifySlicePointer: 0xc000126000

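The one thing that may puzzle you here is why "Original slice after modifySlice" prints [100 1 2] rather than [100 1 2 200]. The 200 was written into the shared backing array, but the slice header in main still has len 3, so only three elements are printed.

If originalSlice is:

originalSlice := []int{0, 1, 2}

the output is: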
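That the appended value really sits in the shared backing array can be checked by reslicing up to the capacity. In this sketch appendInCallee is a hypothetical stand-in for the append step of modifySlice:

```go
package main

import "fmt"

// appendInCallee (hypothetical) appends to its own copy of the slice header,
// like the append step of modifySlice in the example.
func appendInCallee(s []int) {
	s = append(s, 200) // fits in the spare capacity, so no reallocation
	_ = s
}

func main() {
	original := make([]int, 3, 4)
	appendInCallee(original)

	fmt.Println(original)                 // [0 0 0]
	fmt.Println(original[:cap(original)]) // [0 0 0 200]
}
```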

Original slice before modifySlice: [0 1 2]
The address of the underlying array before modifySlice: 0xc0000ac000
Inside modifySlice (modified slice): [100 1 2 200]
The address of the underlying array in modifySlice: 0xc0000b2030
Original slice after modifySlice: [100 1 2]
The address of the underlying array after modifySlice: 0xc0000ac000
Inside modifySlicePointer (modified slice pointer): [100 1 2 200]
The address of the underlying array in modifySlicePointer: 0xc0000b2060
Original slice after modifySlicePointer: [100 1 2 200]
The address of the underlying array after modifySlicePointer: 0xc0000b2060

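This case is clearly more complicated. You can run the code and think it through yourself first.
There are a few suspicious points in this example that deserve attention:

0xc0000b2030 is not equal to 0xc0000ac000? The append exceeded the capacity of 3, so growth moved the slice inside modifySlice to a new backing array. main still prints [100 1 2] because its slice keeps pointing at the old array with len 3: the write s[0] = 100 happened before the growth, while 200 went only into the new array.
0xc0000b2060 is not equal to 0xc0000ac000? This is again caused by growth, so nothing surprising here.
But why is the address in main also 0xc0000b2060, and why is the output [100 1 2 200]? Because the value we passed is a copy of the address of the variable originalSlice. Through that pointer the function stored the new slice into originalSlice itself, so the change is visible in main.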
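Besides passing a pointer, a common way to make an append visible to the caller is to return the resulting slice, which is how the built-in append itself is used. A minimal sketch with a hypothetical appendValue function:

```go
package main

import "fmt"

// appendValue (hypothetical) returns the possibly reallocated slice
// so that the caller can keep using it.
func appendValue(s []int, v int) []int {
	return append(s, v)
}

func main() {
	s := []int{0, 1, 2}
	s = appendValue(s, 200) // reassign, exactly as with append
	fmt.Println(s)          // [0 1 2 200]
}
```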

7. Iterating over a Slice with `range`

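// user is not defined in the original snippet; two string fields are assumed here.
type user struct {
	name string
	site string
}

func main() {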
	u := []user{
		{"wesley", "medium"},
		{"tfrain", "github"},
	}
	n := make([]*user, 0, len(u))
	for _, v := range u {
		fmt.Printf("%p\n", &v)
		n = append(n, &v)
	}
	fmt.Println(n)
	for _, v := range n {
		fmt.Println(v)
	}
}
// output before Go 1.22
// 0xc000060020
// 0xc000060020
// [0xc000060020 0xc000060020]
// &{tfrain github}
// &{tfrain github}

// output with Go 1.22 and later
// 0xc000098020
// 0xc000098040
// [0xc000098020 0xc000098040]
// &{wesley medium}
// &{tfrain github}

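Before Go 1.22, the address of the variable v does not change while ranging over the slice u (it stays 0xc000060020 in the example), so every stored pointer refers to the same variable and the output is counterintuitive. This was fixed in Go 1.22, where each iteration gets its own v for modules that declare go 1.22 or later in go.mod. See: Fixing For Loops in Go 1.22 - The Go Programming Language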
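A version-independent way to collect element pointers is to index into the slice instead of taking the address of the loop variable. The sketch below assumes a user type with two string fields (the original post does not show its definition), and pointersTo is a made-up helper:

```go
package main

import "fmt"

// user is assumed for illustration; the field names are hypothetical.
type user struct {
	name string
	site string
}

// pointersTo (hypothetical) returns pointers to the elements of u itself,
// not to a per-loop variable, so it behaves the same before and after Go 1.22.
func pointersTo(u []user) []*user {
	n := make([]*user, 0, len(u))
	for i := range u {
		n = append(n, &u[i])
	}
	return n
}

func main() {
	u := []user{
		{"wesley", "medium"},
		{"tfrain", "github"},
	}
	for _, p := range pointersTo(u) {
		fmt.Println(p.name, p.site)
	}
	// wesley medium
	// tfrain github
}
```

Unlike &v, these pointers alias the slice elements, so a write through one of them changes u as well.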

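8. Summary

To master the basic use of slices, three points are enough:

Understand how a slice's len and cap work.
Understand that Go passes everything by value.
Understand that a slice points to a backing array and has a growth strategy.

9. References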

Go Slices: usage and internals - The Go Programming Language

For more articles in this series, see the Medium list:

https://wesley-wei.medium.com/list/you-should-know-in-golang-e9491363cd9a

English post: https://programmerscareer.com/golang-slice/
Author: Wesley Wei – Twitter, Wesley Wei – Medium
Note: This post was originally written on 2024-07-21 01:11 at https://programmerscareer.com/golang-slice/. It is the author's original work; please credit the source when reposting.

#code
