CPython Source Code Analysis (slides)
This talk covers how the data structures behind the CPython runtime are implemented.
![Page 1: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/1.jpg)
CPython Source Code Analysis
果凍
http://goo.gl/3mq3Y
![Page 2: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/2.jpg)
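About the speaker
● National Chung Hsing University, Department of Computer Science and Engineering
● Works at The Manx
● Seven years of experience with Python
● Enjoys learning new programming languages
● C, C++, Java, Go
● About me: http://about.me/ya790206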
![Page 3: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/3.jpg)
Outline
1. How C can emulate inheritance
2. PyObject, the root object of Python
3. PyType
4. PyIntObject
5. PyStringObject
6. PyList
https://github.com/ya790206/CPython
![Page 4: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/4.jpg)
ptr->data
address of the data field = start address of the object + offset of the data field from the start of the object
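The address arithmetic on this slide can be written out by hand. The following is a minimal sketch; `struct record` and `data_address` are hypothetical names chosen for illustration, and the standard `offsetof` macro supplies the field offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct used only for this illustration. */
struct record {
    int32_t id;
    int32_t data;
};

/* Compute what the compiler does for ptr->data:
   start address of the object plus the offset of the field. */
int32_t *data_address(struct record *ptr)
{
    return (int32_t *)((char *)ptr + offsetof(struct record, data));
}
```

For any `struct record r`, `data_address(&r)` should equal `&r.data`.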
![Page 5: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/5.jpg)

    #include <stdio.h>
    #include <stdint.h>

    struct classA { int32_t data; };
    struct classB { int8_t data[4]; };
    struct classC { int32_t data; int32_t data1; };
![Page 6: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/6.jpg)

    int main()
    {
        struct classA obj;
        struct classA *pa = &obj;
        struct classB *pb = (struct classB*)&obj;
        struct classC *pc = (struct classC*)&obj;

        /* The same four bytes, read through three different struct layouts. */
        obj.data = 0;
        printf("%d, %d, %d, %d, %d, %d\n", pa->data, pb->data[0], pb->data[1],
               pb->data[2], pb->data[3], pc->data);
        obj.data = 1;
        printf("%d, %d, %d, %d, %d, %d\n", pa->data, pb->data[0], pb->data[1],
               pb->data[2], pb->data[3], pc->data);
        obj.data = 1 << 8;
        printf("%d, %d, %d, %d, %d, %d\n", pa->data, pb->data[0], pb->data[1],
               pb->data[2], pb->data[3], pc->data);

        /* Field address minus the field's offset gives back the object's start address. */
        printf("%p %p\n",
               (void *)((char *)&(pb->data[1]) - (size_t)&(((struct classB*)0)->data[1])),
               (void *)pb);
        return 0;
    }
![Page 7: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/7.jpg)

    0, 0, 0, 0, 0, 0
    1, 1, 0, 0, 0, 1
    256, 0, 1, 0, 0, 256
    0x7fff613428a0 0x7fff613428a0
![Page 8: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/8.jpg)
Little-Endian
Memory layout of obj (32 bits), as seen through each struct:
classA: data
classB: data[0] | data[1] | data[2] | data[3]
classC: data | data1
![Page 9: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/9.jpg)
Emulating inheritance in C

    #include <stdio.h>

    typedef void (*myfunc)();

    /* Shared "base class" fields; every struct starts with the same header. */
    #define Father_HEADER \
        myfunc init;

    struct father {
        Father_HEADER
    };
    struct child1 {
        Father_HEADER
        myfunc custom1;
    };
    struct child2 {
        Father_HEADER
        myfunc custom2;
    };
![Page 10: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/10.jpg)

    void father_init()
    {
        printf("call father init\n");
    }
    void child1_init()
    {
        printf("call child1 init\n");
    }
    void child2_init()
    {
        printf("call child2 init\n");
    }
    void call_init(struct father *obj)
    {
        obj->init();
    }
![Page 11: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/11.jpg)

    int main()
    {
        struct father f_obj = {father_init};
        struct child1 c1_obj = {child1_init, 0};
        struct child2 c2_obj = {child2_init, 0};

        call_init(&f_obj);
        call_init((struct father*) &c1_obj);
        call_init((struct father*) &c2_obj);
        return 0;
    }
![Page 12: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/12.jpg)
Output

    call father init
    call child1 init
    call child2 init
![Page 13: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/13.jpg)
object.h
The roots of every Python object:
1. PyObject
   a. e.g. int
   b. PyObject_HEAD
2. PyVarObject
   a. adds an ob_size field
   b. e.g. string, list
   c. PyObject_VAR_HEAD
ref: http://docs.python.org/2/c-api/structures.html
![Page 14: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/14.jpg)

    #define PyObject_HEAD \
        _PyObject_HEAD_EXTRA \
        Py_ssize_t ob_refcnt; \
        struct _typeobject *ob_type;

    #define PyObject_VAR_HEAD \
        PyObject_HEAD \
        Py_ssize_t ob_size; /* Number of items in variable part */
![Page 15: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/15.jpg)

    typedef struct _object {
        PyObject_HEAD
    } PyObject;

    typedef struct {
        PyObject_VAR_HEAD
    } PyVarObject;
![Page 16: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/16.jpg)
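ob_refcnt:
1. Reference counting

    #define Py_INCREF(op) ( \
        _Py_INC_REFTOTAL _Py_REF_DEBUG_COMMA \
        ((PyObject*)(op))->ob_refcnt++)

ob_type:
a. The object's type, which determines the operations the object supports.
b. In the PyType_Type object, this field points to PyType_Type itself.
c. In every other PyTypeObject, this field points to the PyType_Type object.
d. In all other objects, this field points to the PyTypeObject the object belongs to.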
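To make the role of ob_refcnt concrete, here is a minimal reference-counting sketch. It is not CPython code: `toy_object`, `toy_incref`, `toy_decref`, and the live-object counter are hypothetical names, and the sketch assumes the simplest policy, where an object is freed as soon as its count reaches zero.

```c
#include <stdlib.h>

/* Hypothetical toy object; only the counting idea mirrors PyObject. */
typedef struct toy_object {
    long refcnt;                           /* plays the role of ob_refcnt */
    void (*dealloc)(struct toy_object *);  /* called when refcnt hits zero */
} toy_object;

static int toy_live_objects = 0;

static void toy_dealloc(toy_object *op)
{
    toy_live_objects--;
    free(op);
}

/* A new object starts with one reference, owned by the caller. */
toy_object *toy_new(void)
{
    toy_object *op = malloc(sizeof *op);
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->dealloc = toy_dealloc;
    toy_live_objects++;
    return op;
}

void toy_incref(toy_object *op)
{
    op->refcnt++;
}

/* Dropping the last reference destroys the object. */
void toy_decref(toy_object *op)
{
    if (--op->refcnt == 0)
        op->dealloc(op);
}

int toy_live(void)
{
    return toy_live_objects;
}
```

Every `toy_incref` must be paired with a `toy_decref`; the object disappears only after the final decrement.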
![Page 17: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/17.jpg)
[Diagram: PyIntObject --ob_type--> PyInt_Type --ob_type--> PyType_Type --ob_type--> PyType_Type (itself)]
![Page 18: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/18.jpg)
PyObject
1. PyClassObject
2. PyInstanceObject
3. PyMethodObject
4. PyCodeObject
5. PyComplexObject
6. PyDictObject
7. PyFileObject
8. PyFunctionObject
9. PyIntObject
10. PySetObject
and so on
![Page 19: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/19.jpg)
PyVarObject
1. PyByteArrayObject
2. PyFrameObject
3. PyListObject
4. PyStringObject
5. PyTupleObject
![Page 20: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/20.jpg)
PyTypeObject
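1. Stores the operations that can be performed on objects of the type.
2. For example, PyInt_Type holds the operations the int type supports. It fills in tp_str, so we can write str(5). Calling str(5) invokes the corresponding C function, int_to_decimal_string.
3. Because tp_call is 0 in PyInt_Type, int objects cannot be called. tp_call corresponds to Python's __call__ method.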
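The slot idea can be sketched with a small table of function pointers. This is a toy model, not the real PyTypeObject: `toy_type`, `toy_obj`, `to_str`, and `call` are hypothetical names standing in for the type object, an instance, `tp_str`, and `tp_call`.

```c
#include <stdio.h>
#include <stddef.h>

typedef struct toy_type toy_type;

/* Hypothetical instance: a type pointer (like ob_type) plus a value. */
typedef struct {
    const toy_type *type;
    long value;
} toy_obj;

/* Hypothetical type object: one function pointer per supported operation. */
struct toy_type {
    const char *name;
    int (*to_str)(const toy_obj *self, char *buf, size_t len); /* like tp_str */
    long (*call)(const toy_obj *self);                         /* like tp_call */
};

static int toy_int_to_str(const toy_obj *self, char *buf, size_t len)
{
    return snprintf(buf, len, "%ld", self->value);
}

/* The int-like type fills in to_str but leaves call empty. */
const toy_type toy_int_type = { "int", toy_int_to_str, NULL };

/* Dispatch through the type; return -1 when the slot is empty. */
int toy_str(const toy_obj *obj, char *buf, size_t len)
{
    if (obj->type->to_str == NULL)
        return -1;
    obj->type->to_str(obj, buf, len);
    return 0;
}

int toy_call(const toy_obj *obj, long *result)
{
    if (obj->type->call == NULL)
        return -1;   /* "object is not callable" */
    *result = obj->type->call(obj);
    return 0;
}
```

With `toy_obj five = { &toy_int_type, 5 }`, `toy_str` succeeds while `toy_call` reports failure, mirroring how an empty tp_call makes an object uncallable.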
![Page 21: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/21.jpg)
PyTypeObject
#define Py_TYPE(ob) (((PyObject*)(ob))->ob_type)
PyObject *v;
Py_TYPE(v)->tp_free((PyObject *)v);
![Page 22: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/22.jpg)
PyIntObject

    typedef struct {
        PyObject_HEAD
        long ob_ival;
    } PyIntObject;
![Page 23: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/23.jpg)
PyIntObject
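1. PyInt_FromLong(long ival) is the function that creates an integer object.
2. In the default CPython implementation, the integer objects from -5 to 256 are singletons.
3. A free_list is used to cut down on unnecessary memory allocation and deallocation.
4. Each request to the Python memory system asks for enough space to hold N_INTOBJECTS integers.
5. A chain of n additions creates n - 1 temporary objects, because int_add returns a new PyObject for each result.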
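The small-integer singletons in point 2 can be sketched as a preallocated table. This is an illustrative toy, not CPython's implementation: `toy_smallint`, `toy_int_from_long`, and the `TOY_` constants are hypothetical, and integers outside the cached range are simply taken from `malloc` here.

```c
#include <stdlib.h>

#define TOY_NSMALLNEGINTS 5    /* cache -5 .. -1 */
#define TOY_NSMALLPOSINTS 257  /* cache 0 .. 256 */

/* Hypothetical integer object. */
typedef struct {
    long refcnt;
    long ival;
} toy_smallint;

static toy_smallint toy_small_ints[TOY_NSMALLNEGINTS + TOY_NSMALLPOSINTS];
static int toy_small_ready = 0;

/* Fill the cache once with the values -5 .. 256. */
static void toy_init_small(void)
{
    for (int i = 0; i < TOY_NSMALLNEGINTS + TOY_NSMALLPOSINTS; i++) {
        toy_small_ints[i].refcnt = 1;
        toy_small_ints[i].ival = i - TOY_NSMALLNEGINTS;
    }
    toy_small_ready = 1;
}

/* Small values return the shared cached object; others get a fresh one. */
toy_smallint *toy_int_from_long(long ival)
{
    if (!toy_small_ready)
        toy_init_small();
    if (ival >= -TOY_NSMALLNEGINTS && ival < TOY_NSMALLPOSINTS) {
        toy_smallint *v = &toy_small_ints[ival + TOY_NSMALLNEGINTS];
        v->refcnt++;
        return v;
    }
    toy_smallint *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->refcnt = 1;
    v->ival = ival;
    return v;
}
```

Two requests for 7 return the same pointer, while two requests for 257 return distinct objects that the caller must free.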
![Page 24: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/24.jpg)
How fill_free_list works
PyIntBlock: _intblock *next; PyIntObject objects[N_INTOBJECTS];
[Diagram: the PyIntObject slots (each with ob_refcnt and ob_type) inside a PyIntBlock, with free_list pointing into the block]
![Page 25: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/25.jpg)
Slots: 1 2 3 4 5 6 7 8
free list: 6 -> 7 -> 8
After the object in the third slot is deleted:
Slots: 1 2 3 4 5 6 7 8
free list: 3 -> 6 -> 7 -> 8
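The free-list behaviour shown above can be reproduced with a small sketch. All names (`toy_slot`, `toy_fill_free_list`, `toy_alloc`, `toy_release`) are hypothetical, and the sketch is simplified: it uses a single fixed block of 8 slots and links them front to back, whereas real code would request another block when the list runs dry.

```c
#include <stddef.h>

#define TOY_NSLOTS 8

/* Hypothetical slot: while free, next_free links it into the free list. */
typedef struct toy_slot {
    struct toy_slot *next_free;
    long ival;
} toy_slot;

static toy_slot toy_block[TOY_NSLOTS];
static toy_slot *toy_free_list = NULL;

/* Chain every slot of the block into the free list: 1 -> 2 -> ... -> 8. */
void toy_fill_free_list(void)
{
    for (int i = 0; i < TOY_NSLOTS - 1; i++)
        toy_block[i].next_free = &toy_block[i + 1];
    toy_block[TOY_NSLOTS - 1].next_free = NULL;
    toy_free_list = &toy_block[0];
}

/* Take the first free slot; no call to malloc is needed. */
toy_slot *toy_alloc(long ival)
{
    if (toy_free_list == NULL)
        return NULL;
    toy_slot *v = toy_free_list;
    toy_free_list = v->next_free;
    v->ival = ival;
    return v;
}

/* A released slot becomes the new head of the free list. */
void toy_release(toy_slot *v)
{
    v->next_free = toy_free_list;
    toy_free_list = v;
}

/* 1-based position of the slot at the head of the free list, 0 if empty. */
int toy_free_head(void)
{
    return toy_free_list ? (int)(toy_free_list - toy_block) + 1 : 0;
}
```

After five allocations the head of the free list is slot 6; releasing the third object makes slot 3 the head, so the next allocation reuses it.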
![Page 26: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/26.jpg)
PyString

    typedef struct {
        PyObject_VAR_HEAD
        long ob_shash;
        int ob_sstate;
        char ob_sval[1];
    } PyStringObject;
![Page 27: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/27.jpg)
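Layout: HEADER | ob_shash | ob_sstate | ob_sval[1]
Empty string: ob_sval holds \0
"abc": ob_sval holds a b c \0
1. The allocation size is PyStringObject_SIZE + size.
2. The characters are reached as ob_sval[1], ob_sval[2], ob_sval[3]; C does not check whether an index goes past the declared array size.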
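The "header plus characters in one allocation" layout can be sketched as follows. `toy_string` and `toy_string_new` are hypothetical names; the sketch uses a C99 flexible array member (`char sval[]`), which is the standard-sanctioned variant of the `char ob_sval[1]` trick shown on the slide.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string object with its characters stored inline. */
typedef struct {
    size_t size;   /* like ob_size: length without the trailing '\0' */
    long hash;     /* like ob_shash: -1 means "not computed yet" */
    char sval[];   /* C99 flexible array member holding the characters */
} toy_string;

toy_string *toy_string_new(const char *src)
{
    size_t size = strlen(src);
    /* One allocation: header, `size` characters, and the trailing '\0'. */
    toy_string *op = malloc(sizeof(toy_string) + size + 1);
    if (op == NULL)
        return NULL;
    op->size = size;
    op->hash = -1;
    memcpy(op->sval, src, size + 1);
    return op;
}
```

An empty string still occupies the header plus one byte for the terminator, matching the "empty string" row of the slide.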
![Page 28: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/28.jpg)
PyString
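1. Certain strings go through the intern mechanism, which increases object reuse, although it does not improve performance much.
2. In CPython, one-byte strings and the empty string are singleton objects.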
![Page 29: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/29.jpg)

    PyObject *
    PyString_InternFromString(const char *cp)
    {
        PyObject *s = PyString_FromString(cp);
        if (s == NULL)
            return NULL;
        PyString_InternInPlace(&s);
        return s;
    }
![Page 30: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/30.jpg)
PyString_InternInPlace(PyObject **p)
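1. Check whether the argument is a string; if it is not a string, or is NULL, bail out.
2. If it is an instance of a string subclass, return.
3. If it is already an interned string, return.
4. If the intern dict already holds an equal string, decrement the original string's reference count by 1 and hand back the interned string (through the pointer argument).
5. If the string is not in the intern dict, insert it into the intern dict.
6. Decrement its reference count by 2.
7. Set the string's state to SSTATE_INTERNED_MORTAL.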
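The core idea of interning, one canonical object per distinct value, can be sketched without any of the reference-count bookkeeping. This is a deliberately simplified toy: `toy_intern` and its fixed-size table are hypothetical, the lookup is a linear scan rather than a dict, and entries are never removed.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_INTERN_MAX 64

/* Hypothetical intern table: one canonical copy per distinct string. */
static char *toy_interned[TOY_INTERN_MAX];
static size_t toy_interned_count = 0;

/* Return the canonical copy of s; equal strings yield the same pointer.
   Returns NULL if the table is full or allocation fails. */
const char *toy_intern(const char *s)
{
    for (size_t i = 0; i < toy_interned_count; i++)
        if (strcmp(toy_interned[i], s) == 0)
            return toy_interned[i];   /* already interned: reuse it */
    if (toy_interned_count == TOY_INTERN_MAX)
        return NULL;
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, n);
    toy_interned[toy_interned_count++] = copy;
    return copy;
}
```

Once two strings are interned, comparing them for equality reduces to comparing pointers.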
![Page 31: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/31.jpg)
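Why Py_REFCNT(s) -= 2?
1. Being stored as a key of the intern dict adds one reference.
2. Being stored as a value of the intern dict adds another.
3. These two references are held only by the intern dict. Without subtracting 2, the string would never be destroyed, because the intern dict's key and value would always point to it.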
![Page 32: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/32.jpg)
What calls PyString_InternInPlace?
1. Strings whose length is less than or equal to 1
2. Calls to PyString_InternFromString
3. The user calling intern (builtin_intern on the C side)
![Page 33: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/33.jpg)
string_concat

    op = (PyStringObject *)PyObject_MALLOC(PyStringObject_SIZE + size);
    PyObject_INIT_VAR(op, &PyString_Type, size);
    op->ob_shash = -1;
    op->ob_sstate = SSTATE_NOT_INTERNED;
    Py_MEMCPY(op->ob_sval, a->ob_sval, Py_SIZE(a));
    Py_MEMCPY(op->ob_sval + Py_SIZE(a), b->ob_sval, Py_SIZE(b));
    op->ob_sval[size] = '\0';
    return (PyObject *) op;

1. Every string addition returns a new object.
2. Every addition obtains its memory through PyObject_MALLOC.
![Page 34: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/34.jpg)
Python memory management
[Diagram labels: PyIntBlock, PyIntObject, PyString]
![Page 35: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/35.jpg)
string_join

    /* Allocate result space. */
    res = PyString_FromStringAndSize((char*)NULL, sz);
    if (res == NULL) {
        Py_DECREF(seq);
        return NULL;
    }

1. Memory is allocated only once.
![Page 36: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/36.jpg)

    /* Catenate everything. */
    p = PyString_AS_STRING(res);
    for (i = 0; i < seqlen; ++i) {
        size_t n;
        item = PySequence_Fast_GET_ITEM(seq, i);
        n = PyString_GET_SIZE(item);
        Py_MEMCPY(p, PyString_AS_STRING(item), n);
        p += n;
        if (i < seqlen - 1) {
            Py_MEMCPY(p, sep, seplen);
            p += seplen;
        }
    }

    Py_DECREF(seq);
    return res;
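The contrast with repeated concatenation is easier to see in plain C. The sketch below follows the same two-pass shape as string_join (measure first, allocate once, then copy), but `toy_join` is a hypothetical helper working on ordinary C strings rather than CPython objects.

```c
#include <stdlib.h>
#include <string.h>

/* Join `count` C strings with `sep` using a single allocation.
   The caller frees the result. Returns NULL on allocation failure. */
char *toy_join(const char *sep, const char *const *items, size_t count)
{
    size_t seplen = strlen(sep);
    size_t total = 0;

    /* First pass: compute the length of the final string. */
    for (size_t i = 0; i < count; i++) {
        total += strlen(items[i]);
        if (i + 1 < count)
            total += seplen;
    }

    /* The only allocation, sized for the whole result. */
    char *res = malloc(total + 1);
    if (res == NULL)
        return NULL;

    /* Second pass: copy each item, with a separator between items. */
    char *p = res;
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(items[i]);
        memcpy(p, items[i], n);
        p += n;
        if (i + 1 < count) {
            memcpy(p, sep, seplen);
            p += seplen;
        }
    }
    *p = '\0';
    return res;
}
```

Joining n pieces this way costs one allocation, whereas adding them one by one with string_concat allocates a new object for every intermediate result.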
![Page 37: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/37.jpg)
PyListObject

    typedef struct {
        PyObject_VAR_HEAD
        PyObject **ob_item;
        Py_ssize_t allocated;
    } PyListObject;
![Page 38: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/38.jpg)
Why is ob_item a pointer to a pointer?
[Diagram: ob_item points to an array of PyObject * slots, and each slot points to a PyObject]
PyList_New
1. static PyListObject *free_list[PyList_MAXFREELIST];
2. nbytes = size * sizeof(PyObject *);
3. op = PyObject_GC_New(PyListObject, &PyList_Type);
4. op->ob_item = (PyObject **) PyMem_MALLOC(nbytes);

Notes:
1. op holds the list's bookkeeping information, such as ob_size and ob_refcnt.
2. op->ob_item stores the addresses of the elements in the list.
![Page 40: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/40.jpg)
list_dealloc
1. Py_XDECREF(op->ob_item[i]);
2. PyMem_FREE(op->ob_item);
3. One of the following:
   a. free_list[numfree++] = op;
   b. Py_TYPE(op)->tp_free((PyObject *)op);
![Page 41: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/41.jpg)
app1 (append)
1. list_resize(self, n+1)
2. Py_INCREF(v);
3. PyList_SET_ITEM(self, n, v);
![Page 42: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/42.jpg)
list_resize
1. new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
2. new_allocated += newsize;
3. PyMem_RESIZE(items, PyObject *, new_allocated);
4. self->ob_item = items;
5. Py_SIZE(self) = newsize;
6. self->allocated = new_allocated;

In practice, PyMem_RESIZE ends up calling realloc. A Python list behaves much like a C++ vector.
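The growth formula in steps 1 and 2 can be tried out in isolation. The sketch below copies only that arithmetic; `toy_new_allocated` and `toy_count_resizes` are hypothetical helpers, and the simulation assumes a list that only ever grows by appending one item at a time.

```c
#include <stddef.h>

/* The over-allocation formula from steps 1 and 2 of the slide. */
size_t toy_new_allocated(size_t newsize)
{
    size_t new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
    return new_allocated + newsize;
}

/* Simulate `count` appends and count how often the buffer must grow. */
size_t toy_count_resizes(size_t count)
{
    size_t allocated = 0;
    size_t resizes = 0;
    for (size_t size = 1; size <= count; size++) {
        if (size > allocated) {          /* no spare capacity left */
            allocated = toy_new_allocated(size);
            resizes++;
        }
    }
    return resizes;
}
```

Starting from an empty list, the capacity grows as 4, 8, 16, 25, 35, ..., so 100 appends trigger only 10 reallocations in this model.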
![Page 43: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/43.jpg)
#define PyList_GET_ITEM(op, i) (((PyListObject *)(op))->ob_item[i])
#define PyList_SET_ITEM(op, i, v) (((PyListObject *)(op))->ob_item[i] = (v))
#define PyList_GET_SIZE(op) Py_SIZE(op)
![Page 44: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/44.jpg)
ins1

    if (where > n)
        where = n;
    items = self->ob_item;
    for (i = n; --i >= where; )
        items[i+1] = items[i];
    Py_INCREF(v);
    items[where] = v;
    return 0;
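The shifting loop in ins1 is what makes inserting near the front of a list cost time proportional to the list's length. A standalone sketch on a plain int array, where `toy_insert` is a hypothetical helper and the caller is assumed to have reserved room for one extra element:

```c
#include <stddef.h>

/* Insert `value` at index `where` in items[0..n).
   The caller guarantees capacity for n + 1 elements.
   Returns the new length. */
size_t toy_insert(int *items, size_t n, size_t where, int value)
{
    if (where > n)
        where = n;                 /* clamp: inserting past the end appends */
    for (size_t i = n; i > where; i--)
        items[i] = items[i - 1];   /* shift the tail one slot to the right */
    items[where] = value;
    return n + 1;
}
```

Inserting at index 0 moves every element, while inserting at index n moves none, which is why append is the cheap case.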
![Page 45: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/45.jpg)
References
1. Python C API
2. Extending and Embedding the Python Interpreter
3. Python源码剖析
4. Python source code
![Page 46: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/46.jpg)
Q & A
![Page 47: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/47.jpg)
Thank you
The manx:http://www.themanxgroup.tw/
The manx production:http://lucky-lane.com/
![Page 48: C python 原始碼解析 投影片](https://reader031.vdocuments.net/reader031/viewer/2022020710/54b7752d4a795985568b46c0/html5/thumbnails/48.jpg)