Python Advanced Tutorial m7b – Mixed-Language Programming – The C Interface ctypes (1)

Posted on August 1, 2020; updated October 14, 2021; by 桔子菌

Contents

1. Loading a DLL file
2. Basic data types
- 2.1 c_int, c_long
- 2.2 c_double
- 2.3 c_float, c_longlong
- 2.4 Character and string data
3. Array types

Original link: http://www.juzicode.com/archives/825

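Python can call executables built in other languages, or system commands, through os.popen() or subprocess.run(). That is a process-level call, though: you generally have to wait for the command to finish before you can see its result, which is not very flexible. This article covers API-level (function-level) calls instead, using the standard-library module ctypes, which lets Python code call C-style APIs. Note that ctypes does not support C++-style APIs, in particular C++ features such as classes and overloading.

A C function call has three basic elements: the function name, the arguments, and the return value. The same three elements apply when calling from Python. With ctypes you call a C function by its C name, so the crux is how to map Python data types onto C data types for the arguments and the return value.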

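1. Loading a DLL file

To call a function written in C, you first load a dynamic-link library in Python. It can be one you compiled yourself or one that ships with the system. On Windows this is usually a .dll or .pyd file; on Linux it is usually a .so file.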

from ctypes import *

dll_obj = CDLL('some.dll')
#dll_obj = PyDLL('some.dll')
#dll_obj = WinDLL('some.dll')  # Windows only
#dll_obj = OleDLL('some.dll')  # Windows only

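Whether to use CDLL, WinDLL, or OleDLL depends on the calling convention of the library's functions. CDLL loads functions exported with the cdecl calling convention, while WinDLL calls functions using the stdcall convention. OleDLL also uses stdcall and additionally assumes that the function returns a Windows HRESULT error code; when the call fails, it raises an OSError based on that code. The source also shows that PyDLL inherits from CDLL: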

class PyDLL(CDLL):
    """This class represents the Python library itself.  It allows
    accessing Python API functions.  The GIL is not released, and
    Python exceptions are handled correctly.
    """
    _func_flags_ = _FUNCFLAG_CDECL | _FUNCFLAG_PYTHONAPI

You can also open a library with the LoadLibrary() method of the prebuilt loader objects (cdll, pydll, windll, oledll):

from ctypes import *

dll_obj = cdll.LoadLibrary('some.dll')
#dll_obj = pydll.LoadLibrary('some.dll')
#dll_obj = windll.LoadLibrary('some.dll')  # Windows only
#dll_obj = oledll.LoadLibrary('some.dll')  # Windows only

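Either method gives you a library object, and functions in the library can then be called as object.function(). Let's start with an example that calls the standard C library function isupper() to check whether a character is uppercase: it returns a nonzero value for an uppercase character and 0 otherwise. On Windows most standard C library functions live in msvcrt.dll, which is loaded with CDLL. isupper() takes an int, so the example uses ord() to convert the character to an int: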

from ctypes import *

libc = CDLL('msvcrt.dll')       # load the DLL
ret = libc.isupper(ord('A'))    # call the C function as object.function()
print('isupper(\'A\'):',ret)
ret = libc.isupper(ord('a'))
print('isupper(\'a\'):',ret)

==========Result:
isupper('A'): 1
isupper('a'): 0
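
The listing above only runs on Windows. As a rough sketch for other platforms, assuming ctypes.util.find_library can locate the C library (the helper name load_libc is made up for this illustration and is not part of the original article), the same call might look like this:

```python
import os
from ctypes import CDLL
from ctypes.util import find_library


def load_libc():
    """Hypothetical helper: msvcrt.dll on Windows, the platform C library elsewhere."""
    if os.name == 'nt':
        return CDLL('msvcrt.dll')
    # find_library() may return None; on POSIX systems CDLL(None) then
    # exposes the symbols already loaded into the current process.
    return CDLL(find_library('c'))


libc = load_libc()

# The C standard only promises "nonzero" for uppercase characters,
# so compare against 0 instead of expecting exactly 1.
upper = libc.isupper(ord('A'))
lower = libc.isupper(ord('a'))
print("isupper('A') is nonzero:", upper != 0)
print("isupper('a') is zero:", lower == 0)
```

The exact nonzero value differs between C libraries, which is why the sketch prints a comparison rather than the raw return value.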

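In C, when a parameter is declared int, a char argument is implicitly converted to int, so isupper() can be called directly with a single character. Let's try passing a character directly from Python, as one would in C, and see what happens: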

from ctypes import *

libc = CDLL('msvcrt.dll')   # load the DLL
ret = libc.isupper('A')
print('isupper(\'A\'):',ret)
ret = libc.isupper('a')
print('isupper(\'a\'):',ret)

==========Result:
isupper('A'): 0
isupper('a'): 0

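In this example isupper() returned 0 even for the uppercase letter 'A'! Why the result is wrong is left as an open question for now; it is explained later in this article.

2. Basic data types

ctypes defines a range of data types that map between C types and common Python types. For example, c_int corresponds to C int and Python int, and c_double corresponds to C double and Python float.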

ctypes type	C type	Python type
c_bool	_Bool	bool
c_char	char	1-character bytes object
c_wchar	wchar_t	1-character str object
c_byte	char	int
c_ubyte	unsigned char	int
c_short	short	int
c_ushort	unsigned short	int
c_int	int	int
c_uint	unsigned int	int
c_long	long	int
c_ulong	unsigned long	int
c_longlong	__int64 或 long long	int
c_ulonglong	unsigned __int64 或 unsigned long long	int
c_size_t	size_t	int
c_ssize_t	ssize_t 或 Py_ssize_t	int
c_float	float	float
c_double	double	float
c_longdouble	long double	float
c_char_p	char * (NUL terminated)	bytes object or None
c_wchar_p	wchar_t * (NUL terminated)	str object or None
c_void_p	void *	int 或 None

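To create a ctypes object, write a = type(value); or create the object first and assign afterwards: a = type(); a.value = value.

a = c_int(100)  # define and initialize in one step

a=c_int()    # create the object first
a.value=100  # then assign the value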
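
As a small illustration of the mapping table, the sketch below creates a few ctypes objects and reads them back through .value. The variable names are arbitrary and chosen only for this example.

```python
from ctypes import c_bool, c_char, c_char_p, c_double, c_int, c_wchar

n = c_int(100)
n.value += 1                 # .value is an ordinary Python int

x = c_double(1.5)            # .value comes back as a Python float
ch = c_char(b'A')            # a 1-character bytes object
wch = c_wchar('A')           # a 1-character str
s = c_char_p(b'juzicode')    # a NUL-terminated byte string
flag = c_bool(True)
empty = c_int()              # no initializer: the value starts at 0

for obj in (n, x, ch, wch, s, flag, empty):
    # Show each ctypes object together with the Python type of its value.
    print(repr(obj), '->', type(obj.value).__name__, repr(obj.value))
```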

2.1 c_int, c_long


from ctypes import *

a = c_int(100)
print('a=',a)
print('a.value=',a.value)
 
b = c_long(100)
print('b=',b)
print('b.value=',b.value)

==========Result:
a= c_long(100)  # defined as c_int, but printed as c_long
a.value= 100
b= c_long(100)
b.value= 100

In the example above, a was defined as c_int but printed as c_long. The ctypes source shows why: when int and long have the same size (as they do on Windows), c_int is simply an alias for c_long:

if _calcsize("i") == _calcsize("l"):
    # if int and long have the same size, make c_int an alias for c_long
    c_int = c_long
    c_uint = c_ulong
else:
    class c_int(_SimpleCData):
        _type_ = "i"
    _check_size(c_int)

    class c_uint(_SimpleCData):
        _type_ = "I"
    _check_size(c_uint)

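A Python int is arbitrary-precision in principle, so a value passed to a C function may be truncated to fit the width of the C integer type. Keep this in mind when making calls; otherwise the result may not be what you expect.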
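
The truncation can be observed without calling any C function, because the ctypes integer types do no overflow checking. The following is a minimal sketch; fits_c_int is a hypothetical helper introduced here only to show one way of guarding against the problem.

```python
from ctypes import c_int, c_ubyte, sizeof

bits = 8 * sizeof(c_int)             # width of a C int on this platform

too_big = 2 ** bits + 5              # does not fit into a C int
wrapped = c_int(too_big).value       # high bits are silently dropped
byte_wrapped = c_ubyte(257).value    # same effect for an unsigned char

print('c_int width in bits:', bits)
print('c_int(2**bits + 5).value =', wrapped)
print('c_ubyte(257).value =', byte_wrapped)


def fits_c_int(value):
    """Hypothetical guard: True if value is representable as a C int."""
    return -(2 ** (bits - 1)) <= value < 2 ** (bits - 1)


print('fits_c_int(100):', fits_c_int(100))
print('fits_c_int(too_big):', fits_c_int(too_big))
```

Checking the range in Python before the call is one option; choosing a wider type such as c_longlong is another.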

2.2 c_double

Next, c_double. Let's try the standard C library function pow(x,y) to compute x to the power y:

from ctypes import *

a = c_double(3.0)
b = c_double(2.0)

print('a=',a)
print('a.value=',a.value)
print('b=',b)
print('b.value=',b.value)

libc = CDLL('msvcrt.dll')   # load the DLL

ret = libc.pow(a,b)         # call the C function as object.function()
print('pow(a,b):',ret)

==========Result:
a= c_double(3.0)
a.value= 3.0
b= c_double(2.0)
b.value= 2.0
pow(a,b): 11

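This example does not produce the expected result of 9. By default ctypes assumes that a C function returns int; unless you declare the function's real argument and return types, the call is handled with those defaults, and here the double returned by pow() is misread as an int. The standard practice is to declare the return type (restype) and the argument types (argtypes) before calling the function. argtypes is a tuple, and even a single argument must be written in tuple form: (arg1,). The comma is mandatory; without it you get a "TypeError: argtypes must be a sequence of types" exception!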

from ctypes import *

a = c_double(3.0)
b = c_double(2.0)

print('a=',a)
print('a.value=',a.value)
print('b=',b)
print('b.value=',b.value)

libc = CDLL('msvcrt.dll')    

libc.pow.restype = c_double   # declare the return type
libc.pow.argtypes = (c_double, c_double)   # declare the argument types
ret = libc.pow(a,b)          
print('pow(a,b):',ret)

==========Result:
a= c_double(3.0)
a.value= 3.0
b= c_double(2.0)
b.value= 2.0
pow(a,b): 9.0

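With the argument and return types declared this way, the result is correct.

Declaring the return and argument types of a C function:
When a function's arguments or return value are not of type int, you must declare its return type (restype) and argument types (argtypes) before calling it. argtypes is a tuple; even a single argument must be written in tuple form: (arg1,).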
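
The rule can be tried on platforms other than Windows as well. The sketch below assumes that ctypes.util.find_library('m') finds the math library on POSIX systems; load_math_library is a hypothetical helper name used only here. It also shows the one-element tuple form with sqrt().

```python
import os
from ctypes import CDLL, c_double
from ctypes.util import find_library


def load_math_library():
    """Hypothetical helper: the library that provides pow() and sqrt()."""
    if os.name == 'nt':
        return CDLL('msvcrt.dll')
    # If find_library() returns None, CDLL(None) falls back to the
    # symbols already loaded into the current process (POSIX only).
    return CDLL(find_library('m'))


libm = load_math_library()

# Two arguments: an ordinary tuple.
libm.pow.restype = c_double
libm.pow.argtypes = (c_double, c_double)

# One argument: the trailing comma makes it a tuple.
libm.sqrt.restype = c_double
libm.sqrt.argtypes = (c_double,)

# With argtypes declared, plain Python floats can be passed directly.
power = libm.pow(3.0, 2.0)
root = libm.sqrt(16.0)
print('pow(3.0, 2.0) =', power)
print('sqrt(16.0) =', root)
```

Once argtypes is set, ctypes converts the Python floats itself, so wrapping each argument in c_double() is no longer required.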

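2.3 c_float, c_longlong

The earlier sections used standard C library functions to get familiar with c_int and c_double; the other basic C types work analogously. Next we compile our own DLL to study arguments and return values of different types. This part assumes some familiarity with C. The C code below was compiled with VS2015, and the bitness (32-bit or 64-bit) of the resulting DLL must match that of the Python interpreter.

//Several add functions for different types, defined in C: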
#ifdef __cplusplus
extern "C" {
#endif 

__declspec(dllexport) int addi(int x, int y) 
{
	return x + y;
}

__declspec(dllexport) double addd(double x, double y)
{
	return x + y;
}

__declspec(dllexport) float addf(float x, float y)
{
	return x + y;
}

__declspec(dllexport) long long addll(long long x, long long y)
{
	return x + y;
}

#ifdef __cplusplus
}
#endif

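# Using these functions from Python

from ctypes import *
pyt = CDLL('pytest.dll')   # load the DLL

# int arguments and return values need no type declaration
ret = pyt.addi(55,22)
print('addi(55,22)=',ret)

# all other argument and return types below must be declared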
pyt.addd.restype=c_double
pyt.addd.argtypes=(c_double,c_double)
ret = pyt.addd(5.55555555,2.22222222)
print('addd(5.55555555,2.22222222)=',ret)
 
pyt.addf.restype=c_float
pyt.addf.argtypes=(c_float,c_float)
ret = pyt.addf(5.55,2.22)
print('addf(5.55,2.22)=',ret)

pyt.addll.restype=c_longlong
pyt.addll.argtypes=(c_longlong,c_longlong)
ret = pyt.addll(5555555555555555555,2222222222222222222)
print('addll(5555555555555555555,2222222222222222222)=',ret)

==========Result:
addi(55,22)= 77
addd(5.55555555,2.22222222)= 7.77777777
addf(5.55,2.22)= 7.770000457763672
addll(5555555555555555555,2222222222222222222)= 7777777777777777777
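
The addf() result is not exactly 7.77 because a C float carries only single precision. The effect can be approximated in pure Python by rounding through c_float, as in the sketch below; this is an illustration of the rounding, not a call into pytest.dll.

```python
from ctypes import c_float, sizeof

# Round both operands to single precision, as happens when they are
# converted to C float arguments.
x = c_float(5.55).value
y = c_float(2.22).value

# Round the sum to single precision again, like the float return value.
single = c_float(x + y).value
double = 5.55 + 2.22         # the same sum in double precision

print('sizeof(c_float) =', sizeof(c_float))
print('single precision sum:', single)
print('double precision sum:', double)
print('difference:', abs(single - double))
```

The difference is tiny, yet it matters when a result is compared for equality; c_double avoids most of it at the cost of a wider type on the C side.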

2.4 Character and string data

Character and string data are a special case, so they get their own section here.

//C function that takes a char and returns a char:
__declspec(dllexport) char trans_c(char y)
{
	return y ;
}

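According to the mapping table, a C char can correspond to either ctypes c_char or c_byte, which map to a Python bytes object and a Python int respectively. So the call can be written in the following two forms: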

from ctypes import *

pyt = CDLL('pytest.dll')   # load the DLL

pyt.trans_c.restype=c_char
pyt.trans_c.argtypes=(c_char,)
ret = pyt.trans_c(b'A')   # c_char maps to a Python bytes object, so pass b'A', not 'A'
print('trans_c(A)=',ret)


pyt.trans_c.restype=c_byte
pyt.trans_c.argtypes=(c_byte,)
ret = pyt.trans_c(ord('A')) # c_byte maps to a Python int, so convert with ord('A') instead of passing 'A'
print('trans_c(A)=',ret)

==========Result:
trans_c(A)= b'A'
trans_c(A)= 65

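As the example shows, when a C function takes a char, you cannot pass a Python str directly; you must pass a Python bytes object or an int instead. This is also why calling the standard C library function isupper() with the Python str 'A' gave the wrong result earlier: without a declared argument type, ctypes passes a str as a pointer to wchar_t data rather than as a character code, so isupper() never sees the value 65.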
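
Declaring argtypes turns this silent mistake into an immediate error. The sketch below revisits isupper() with its argument declared as c_int; the library-loading line assumes that ctypes.util.find_library('c') works on non-Windows systems.

```python
import os
from ctypes import CDLL, ArgumentError, c_int
from ctypes.util import find_library

# msvcrt.dll on Windows; elsewhere whatever find_library() reports
# (CDLL(None) on POSIX falls back to the current process).
libc = CDLL('msvcrt.dll') if os.name == 'nt' else CDLL(find_library('c'))

libc.isupper.argtypes = (c_int,)
libc.isupper.restype = c_int

try:
    libc.isupper('A')            # a str is not acceptable for c_int
    rejected = False
except ArgumentError as exc:
    rejected = True
    print('rejected:', exc)

accepted = libc.isupper(ord('A')) != 0
print("isupper(ord('A')) is nonzero:", accepted)
```

This is a further reason to declare argtypes even for int parameters: a wrong argument raises an exception instead of producing a misleading 0.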

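The next example uses wchar_t as the argument type. From the mapping table, a single-character Python str can be passed to the C function; as before, passing a str of more than one character raises an error:

//C function that takes a wchar_t and returns a wchar_t: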
__declspec(dllexport) wchar_t trans_wc(wchar_t y)
{
	return y;
}

from ctypes import *
pyt = CDLL('pytest.dll')   # load the DLL

pyt.trans_wc.restype=c_wchar
pyt.trans_wc.argtypes=(c_wchar,)
ret = pyt.trans_wc('桔')
print('trans_wc(A)=',ret)

ret = pyt.trans_wc('桔子code')
print('trans_wc(A)=',ret)

==========Result:
trans_wc(A)= 桔

Traceback (most recent call last):
  File "E:\juzicode\py3study\m8-c接口\struct\ctypes-datatype-char.py", line 19, in <module>
    ret = pyt.trans_wc('桔子code')
ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type

As the examples above show, c_char, c_byte, and c_wchar can each carry only a single character. To pass a string of several characters, use the pointer types c_char_p and c_wchar_p:

__declspec(dllexport) char* trans_cp(char* y)
{
	return y;
}
__declspec(dllexport) wchar_t* trans_wcp(wchar_t* y)
{
	return y;
}

from ctypes import *
pyt = CDLL('pytest.dll')   # load the DLL
pyt.trans_cp.restype=c_char_p
pyt.trans_cp.argtypes=(c_char_p,)
ret = pyt.trans_cp(b'juzicode')
print('trans_cp()=',ret)

pyt.trans_wcp.restype=c_wchar_p
pyt.trans_wcp.argtypes=(c_wchar_p,)
ret = pyt.trans_wcp('桔子code')
print('trans_wcp()=',ret)

==========Result:
trans_cp()= b'juzicode'
trans_wcp()= 桔子code

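Summary of character types:
c_char: passes a single character; corresponds to Python bytes, written b'X'; C type char.
c_wchar: passes a single wide character; corresponds to Python str, written 'X'; C type wchar_t.
c_char_p: passes a string; corresponds to Python bytes, written b'XYZ'; C type char*.
c_wchar_p: passes a string; corresponds to Python str, written 'XYZ'; C type wchar_t*.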
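
The two pointer types can also be tried without pytest.dll by calling the standard C functions strlen() and wcslen(). This is a sketch that assumes both functions are exported by the C library that ctypes.util.find_library('c') finds (or by msvcrt.dll on Windows).

```python
import os
from ctypes import CDLL, c_char_p, c_size_t, c_wchar_p
from ctypes.util import find_library

libc = CDLL('msvcrt.dll') if os.name == 'nt' else CDLL(find_library('c'))

# char* parameter: pass a Python bytes object.
libc.strlen.argtypes = (c_char_p,)
libc.strlen.restype = c_size_t

# wchar_t* parameter: pass a Python str.
libc.wcslen.argtypes = (c_wchar_p,)
libc.wcslen.restype = c_size_t

n_bytes = libc.strlen(b'juzicode')
n_wide = libc.wcslen('桔子code')

# A str has to be encoded before it can be passed as char*; strlen()
# then counts bytes, not characters.
n_utf8 = libc.strlen('桔子code'.encode('utf-8'))

print('strlen(b"juzicode") =', n_bytes)
print('wcslen("桔子code") =', n_wide)
print('strlen of the UTF-8 encoding =', n_utf8)
```

The last call illustrates that the choice between c_char_p and c_wchar_p also decides whether the C side sees encoded bytes or wide characters.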

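3. Array types

Building on the basic types above, ctypes also lets you construct array types. To define one: new_type = ctypes_basic_type * n. To create and initialize an instance: obj = new_type(v1, v2, ..., vn). The following example creates an array of 10 ints and initializes it: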

c_int_10 = c_int * 10  
arr_int = c_int_10(1,2,3,4,5,6,7,8,9,10)  # the number of initializers must not exceed the declared length
print('arr_int:',arr_int)
for ar in arr_int:
    print(ar,end=' ')

==========Result:
arr_int: <__main__.c_long_Array_10 object at 0x0000018840C8FC48>
1 2 3 4 5 6 7 8 9 10
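
A few more properties of array types are sketched below: building one from a Python list, partial initialization, indexing and slicing, and what happens when too many initializers are given. The names IntArray5, arr and partial are illustrative only.

```python
from ctypes import c_int, sizeof

values = [1, 2, 3, 4, 5]
IntArray5 = c_int * len(values)      # size the array type from the list
arr = IntArray5(*values)             # unpack the list as initializers

partial = (c_int * 5)(1, 2)          # missing elements start at 0

arr[0] = 10                          # elements can be assigned by index
print('len(arr) =', len(arr))
print('list(arr) =', list(arr))
print('arr[1:3] =', arr[1:3])        # slicing returns a Python list
print('list(partial) =', list(partial))
print('sizeof(arr) =', sizeof(arr), 'bytes')

try:
    (c_int * 2)(1, 2, 3)             # more initializers than elements
    overflow_rejected = False
except IndexError as exc:
    overflow_rejected = True
    print('too many initializers:', exc)
```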

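The arr_int example shows that the new array type is iterable. The next C function takes an int array of 10 elements, adds them up inside the function, and returns the sum. The new array type is used when declaring the argument types: the C parameter is int x[10], and the Python declaration is pyt.trans_array_i.argtypes=(c_int_10,):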

__declspec(dllexport) long long trans_array_i(int x[10])
{
	long long sum = 0;
	printf("in c:");
	for (int i = 0; i < 10; i++)
	{
		printf("%d ", x[i]);
		sum += x[i];
	}
	printf("\n");
	return sum;
}

from ctypes import *
# load the DLL
pyt = CDLL('pytest.dll')   
# define the new array type
c_int_10 = c_int * 10  
arr_int = c_int_10(1,2,3,4,5,6,7,8,9,10)  # the number of initializers must not exceed the declared length
# declare the return and argument types of the C function
pyt.trans_array_i.restype=c_longlong
pyt.trans_array_i.argtypes=(c_int_10,)
# call the C function
ret = pyt.trans_array_i(arr_int)
print('trans_array_i()=',ret)

==========Result:
in c:1 2 3 4 5 6 7 8 9 10
trans_array_i()= 55

The function in the next example takes a character array of 20 elements, prints its contents, and returns the character at index 0:

__declspec(dllexport) char trans_array_c(char x[20])
{
	for (int i = 0; i < 20; i++)
	{
		printf("%c ", x[i]);
	}
	printf("\n");
	return x[0];
}

from ctypes import *
# load the DLL
pyt = CDLL('pytest.dll')   
# define the new array type
c_char_20 = c_char * 20  
arr_char = c_char_20(b'j',b'u',b'z',b'i',b'c',b'o',b'd',b'e',b'.',b'c',b'o',b'm')
pyt.trans_array_c.restype=c_char
pyt.trans_array_c.argtypes=(c_char_20,)
# call the C function
ret = pyt.trans_array_c(arr_char)
print('trans_array_c()=',ret)

==========Result:
j u z i c o d e . c o m
trans_array_c()= b'j'
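
Only 12 visible characters are printed because the remaining 8 elements of the array are NUL bytes. The sketch below shows this with the .value and .raw attributes of a c_char array, and passes the array to the standard C function strlen() instead of pytest.dll. It assumes the C library can be found through ctypes.util.find_library('c') on non-Windows systems; CharArray20 and buf are illustrative names.

```python
import os
from ctypes import CDLL, c_char, c_size_t
from ctypes.util import find_library

libc = CDLL('msvcrt.dll') if os.name == 'nt' else CDLL(find_library('c'))

CharArray20 = c_char * 20
buf = CharArray20()                  # 20 NUL bytes to begin with
buf.value = b'juzicode.com'          # fill from a bytes object in one step

# Declare the array type as the argument type, as in the article's example.
libc.strlen.argtypes = (CharArray20,)
libc.strlen.restype = c_size_t
length = libc.strlen(buf)

print('buf.value =', buf.value)      # contents up to the first NUL
print('buf.raw   =', buf.raw)        # all 20 bytes, NUL padding included
print('strlen(buf) =', length)
```

Assigning to .value is a shorter alternative to listing every character as a separate b'x' initializer.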

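This article covered how to load a dynamic-link library in Python, the basic ctypes data types, array types, and how to declare argument and return types, with particular attention to the pitfalls of character data. The following articles go deeper into custom data types, pointers, and related topics in ctypes.

Summary: 1. Choose CDLL or WinDLL to load the library according to the calling convention of its C functions; 2. If a C function's arguments or return value are not int, declare them with argtypes and restype; argtypes is a tuple; 3. c_char and c_wchar carry only a single-character bytes or str, while c_char_p and c_wchar_p carry multi-character bytes and str.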