Character set: utf8mb4. Use show variables like 'character_set_%'; to view the server's default character sets. MySQL has two encodings named utf8 and utf8mb4; in MySQL, forget utf8 and always use utf8mb4. This is a MySQL legacy quirk: MySQL's utf8 supports characters of at most 3 bytes, so characters that need 4 bytes (emoji, many supplementary-plane characters) cannot be stored with utf8 and require utf8mb4.
Using CAS (Compare and Swap): at the programming-language level, a CAS operation compares the value in memory with an expected value; if they are equal, the value is replaced, otherwise the modification is abandoned. The choice between optimistic and pessimistic locking depends on the scenario: pessimistic locking suits workloads with frequent conflicts, avoiding concurrency problems by holding exclusive access at some cost to throughput; optimistic locking suits workloads with few conflicts, improving performance through optimistic concurrency control but requiring explicit conflict handling. In practice, pick the mechanism that matches the business scenario, and plan conflict-handling and transaction-rollback strategies to keep data consistent and complete.
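A sketch of the CAS retry pattern described above, in Go using sync/atomic (the helper name addWithCAS is illustrative):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// addWithCAS adds delta to *counter using a compare-and-swap retry loop:
// read the current value, compute the new value, and store it only if no
// other goroutine has modified the memory in between; otherwise retry.
func addWithCAS(counter *int32, delta int32) {
	for {
		old := atomic.LoadInt32(counter)
		if atomic.CompareAndSwapInt32(counter, old, old+delta) {
			return
		}
		// CAS failed: another goroutine won the race; retry with a fresh read.
	}
}

func main() {
	var counter int32
	addWithCAS(&counter, 5)
	addWithCAS(&counter, 7)
	fmt.Println(counter) // 12
}
```

This is the "optimistic" shape: no lock is held, and the cost of a conflict is one retry of the loop body.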
explain select * from test_xxxx_tab txt order by id limit 10000,10; explain SELECT * from test_xxxx_tab txt where id >= (select id from test_xxxx_tab txt order by id limit 10000,1) limit 10; id: identifies each query when a complex statement contains several. select_type: select/subquery/derived/union. table: which table the row is accessing. type: access/join type, very important: ALL, index, range, ref, const. possible_keys: the indexes MySQL could choose from. key: the index MySQL actually decided to use. key_len: the number of bytes of the index that are used. rows: the estimated number of rows that must be read to find the result.
Slow query log example
# Time: 2022-05-10T10:15:32.123456Z
# User@Host: myuser[192.168.0.1] @ localhost []  Id: 12345
# Query_time: 3.456789  Lock_time: 0.123456  Rows_sent: 10  Rows_examined: 100000
SET timestamp=1657475732;
SELECT * FROM orders WHERE customer_id = 1001 ORDER BY order_date DESC LIMIT 10;

This slow-query-log sample contains the following important fields:
Time: when the entry was logged, in UTC. User@Host: the user and host address that executed the query. Id: the connection ID. Query_time: how long the query took, in seconds. Lock_time: time spent waiting for locks during execution, in seconds. Rows_sent: the number of rows in the returned result set. Rows_examined: the number of rows scanned while executing the query. SET timestamp: the timestamp at which the query began. The final line is the query that was actually executed (SELECT * FROM orders WHERE customer_id = 1001 ORDER BY order_date DESC LIMIT 10).
show global variables; show variables like '%max_connection%'; -- view the maximum number of connections show status like 'Threads%'; show processlist; show variables like '%connection%';
-- desc information_schema.tables;
-- View the size of every database on the server
SELECT table_schema AS 'database',
       SUM(table_rows) AS 'rows',
       SUM(truncate(data_length / 1024 / 1024, 2)) AS 'data (MB)',
       SUM(truncate(index_length / 1024 / 1024, 2)) AS 'index (MB)',
       SUM(truncate(DATA_FREE / 1024 / 1024, 2)) AS 'fragmentation (MB)'
FROM information_schema.tables
GROUP BY table_schema
ORDER BY SUM(data_length) DESC, SUM(index_length) DESC;

-- View the size of each table in a given database
SELECT table_schema AS 'database',
       table_name AS 'table',
       table_rows AS 'rows',
       truncate(data_length / 1024 / 1024, 2) AS 'data (MB)',
       truncate(index_length / 1024 / 1024, 2) AS 'index (MB)',
       truncate(DATA_FREE / 1024 / 1024, 2) AS 'fragmentation (MB)'
FROM information_schema.tables
WHERE table_schema = '<database name>'
ORDER BY data_length DESC, index_length DESC;
MySQL login: mysql -h<host> -P<port> -u<user> -p<password> SET PASSWORD FOR 'root'@'localhost' = PASSWORD('root'); create database wxquare_test; show databases; use wxquare_test;
UPDATE employees
SET salary = CASE
    WHEN grade = 'A' THEN salary * 1.1
    WHEN grade = 'B' THEN salary * 1.05
    WHEN grade = 'C' THEN salary * 1.03
    ELSE salary
END
WHERE department = 'IT';
Give the design document's target readers the background they need to understand the detailed design, scoped to that readership (see the earlier section on identifying target readers). A design document should be self-contained: it should provide enough background that readers can understand the design that follows without further research. Keep it brief, usually a few paragraphs, each a short introduction; if readers may need more detail, link to it rather than inlining it. Beware the curse of knowledge (a cognitive bias in which, when communicating with others, you unconsciously assume they already have the background knowledge needed to understand the topic).
The Domain Name System translates domain names such as www.example.com into IP addresses. DNS is hierarchical, with a small number of authoritative servers at the top level. When you look up a domain's IP, your router or ISP supplies the DNS server to contact. Lower-level DNS servers cache the mappings, which can become stale because of DNS propagation delay. DNS results can also be cached by the browser or the operating system for a period determined by the record's time to live (TTL).
Always return HTTP status code 200, whether the request succeeds or fails, and put the error information (such as "user account not found") in the HTTP body:
For example, the Facebook API's error code design always returns HTTP status 200:
{
  "error": {
    "message": "Syntax error \"Field picture specified more than once. This is only possible before version 2.1\" at character 23: id,name,picture,picture",
    "type": "OAuthException",
    "code": 2500,
    "fbtrace_id": "xxxxxxxxxxx"
  }
}
Drawback: for every request, the client must parse the HTTP body to extract the error code and error message.
Return an HTTP 404 Not Found status code, with a brief error message in the body:
For example, the Twitter API's error design returns an HTTP code appropriate to the error type, with the error message and a custom business code in the body:
HTTP/1.1 400 Bad Request {"errors":[{"code":215,"message":"Bad Authentication data."}]}
Return an HTTP 404 Not Found status code, with detailed error information in the body:
For example, the Microsoft Bing API's error design returns an HTTP code appropriate to the error type, with detailed error information in the body:
HTTP/1.1 400
{
  "code": 100101,
  "message": "Database error",
  "reference": "https://github.com/xx/tree/master/docs/guide/faq/xxxx"
}
sudo ip netns add ns1
sudo ip netns add ns2
sudo ip netns add ns3
sudo brctl addbr virtual-bridge
sudo ip link add veth-ns1 type veth peer name veth-ns1-br
sudo ip link set veth-ns1 netns ns1
sudo brctl addif virtual-bridge veth-ns1-br
sudo ip link add veth-ns2 type veth peer name veth-ns2-br
sudo ip link set veth-ns2 netns ns2
sudo brctl addif virtual-bridge veth-ns2-br
sudo ip link add veth-ns3 type veth peer name veth-ns3-br
sudo ip link set veth-ns3 netns ns3
sudo brctl addif virtual-bridge veth-ns3-br
sudo ip -n ns1 addr add local 192.168.1.1/24 dev veth-ns1
sudo ip -n ns2 addr add local 192.168.1.2/24 dev veth-ns2
sudo ip -n ns3 addr add local 192.168.1.3/24 dev veth-ns3
sudo ip link set virtual-bridge up
sudo ip link set veth-ns1-br up
sudo ip link set veth-ns2-br up
sudo ip link set veth-ns3-br up
sudo ip -n ns1 link set veth-ns1 up
sudo ip -n ns2 link set veth-ns2 up
sudo ip -n ns3 link set veth-ns3 up
sudo ip netns delete ns1
sudo ip netns delete ns2
sudo ip netns delete ns3
sudo ip link set virtual-bridge down
sudo brctl delbr virtual-bridge
$ sudo ip netns exec ns1 ping 192.168.1.2
PING 192.168.1.2 (192.168.1.2): 56 data bytes
64 bytes from 192.168.1.2: seq=0 ttl=64 time=0.068 ms
--- 192.168.1.2 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.060/0.064/0.068 ms

$ sudo ip netns exec ns1 ping 192.168.1.3
PING 192.168.1.3 (192.168.1.3): 56 data bytes
64 bytes from 192.168.1.3: seq=0 ttl=64 time=0.055 ms
--- 192.168.1.3 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.055/0.378/1.016 ms
Go and C++ are two different programming languages with different design goals, syntax, and feature sets. Here’s a brief comparison of the two:
Syntax: Go has a simpler syntax than C++, with fewer keywords and symbols and a single canonical formatting style (enforced by gofmt). C++ has a more complex syntax with many features that can make it harder to learn and use effectively.
Memory Management: C++ gives the programmer more control over memory management through its support for pointers, manual memory allocation, and deallocation. Go, on the other hand, uses a garbage collector to automatically manage memory, making it less error-prone.
Concurrency: Go has built-in support for concurrency through goroutines and channels, which make it easier to write concurrent code. C++ has a thread library that can be used to write concurrent code, but it requires more manual management of threads and locks.
Performance: C++ is often considered a high-performance language, and it can be used for system-level programming and performance-critical applications. Go is also fast but may not be as fast as C++ in some cases.
Libraries and Frameworks: C++ has a vast ecosystem of libraries and frameworks that can be used for a variety of applications, from game development to machine learning. Go’s ecosystem is smaller, but it has good support for web development and distributed systems.
Overall, the choice of programming language depends on the project requirements, the available resources, and the developer’s expertise. Both Go and C++ have their strengths and weaknesses, and the best choice depends on the specific needs of the project.
func main() {
	done := make(chan struct{})
	s := make(chan int)
	go func() {
		s <- 1
		close(done)
	}()
	fmt.Println(<-s)
	<-done
}

func main() {
	sem := make(chan struct{}, 2) // at most two goroutines run at once
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			defer func() { <-sem }()
			sem <- struct{}{}
			time.Sleep(1 * time.Second)
			fmt.Println("id=", id)
		}(i)
	}
	wg.Wait()
}

func main() {
	go func() {
		tick := time.Tick(1 * time.Second)
		for {
			select {
			case <-time.After(5 * time.Second):
				fmt.Println("time out")
			case <-tick:
				fmt.Println("time tick 1s")
			default:
				fmt.Println("default")
			}
		}
	}()
	<-(chan struct{})(nil) // receiving from a nil channel blocks forever
}
Go concurrency model (Goroutine / channel / GMP)
What is CSP?
The Communicating Sequential Processes (CSP) model is a theoretical model of concurrent programming first introduced by Tony Hoare in 1978. The CSP model is based on the idea of concurrent processes that communicate with each other by sending and receiving messages through channels. The Go programming language supports the CSP model through its built-in concurrency features, goroutines and channels. In Go, concurrent processes are represented by goroutines, which are lightweight threads of execution. Communication between goroutines happens through channels, which provide a mechanism for passing values between goroutines in a safe and synchronized manner.
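A minimal CSP-style sketch in Go: two stages that share no memory and interact only by message passing over channels (the function names are illustrative):

```go
package main

import "fmt"

// generate is a producer process: it sends the given values into a
// channel and closes it when done.
func generate(nums ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, n := range nums {
			out <- n
		}
	}()
	return out
}

// square is a transformer process: it receives values, squares them,
// and sends the results on.
func square(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			out <- n * n
		}
	}()
	return out
}

func main() {
	// main is the consumer process; the three processes form a pipeline
	// connected only by channels.
	for v := range square(generate(1, 2, 3)) {
		fmt.Println(v) // 1, 4, 9
	}
}
```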
What is a Goroutine?
Goroutines are lightweight, user-level threads of execution that run concurrently with other goroutines within the same process.
Unlike traditional threads, goroutines are managed by the Go runtime, which automatically schedules and balances their execution across multiple CPUs and makes efficient use of available system resources.
Goroutines, threads, and processes are all mechanisms for writing concurrent and parallel code, but they have some important differences:
Goroutines: A goroutine is a lightweight, user-level thread of execution that runs concurrently with other goroutines within the same process. Goroutines are managed by the Go runtime, which automatically schedules and balances their execution across multiple CPUs. Goroutines require much less memory and have much lower overhead compared to threads, allowing for many goroutines to run simultaneously within a single process.
Threads: A thread is a basic unit of execution within a process. Threads are independent units of execution that share the same address space as the process that created them. This allows threads to share data and communicate with each other, but also introduces the need for explicit synchronization to prevent race conditions and other synchronization issues.
Processes: A process is a self-contained execution environment that runs in its own address space. Processes are independent of each other, meaning that they do not share memory or other resources. Communication between processes requires inter-process communication mechanisms, such as pipes, sockets, or message queues.
In general, goroutines provide a more flexible and scalable approach to writing concurrent code compared to threads, as they are much lighter and more efficient, and allow for many more concurrent units of execution within a single process. Processes provide a more secure and isolated execution environment, but have higher overhead and require more explicit communication mechanisms.
Why is Goroutine lighter and more efficient than thread or process?
Stack size: Goroutines have a much smaller stack size compared to threads. The stack size of a goroutine is dynamically adjusted by the Go runtime, based on the needs of the goroutine. This allows for many more goroutines to exist simultaneously within a single process, as they require much less memory.
Scheduling: Goroutines are scheduled by the Go runtime, which automatically balances and schedules their execution across multiple CPUs. This eliminates the need for explicit thread management and synchronization, reducing overhead.
Context switching: Context switching is the process of saving and restoring the state of a running thread in order to switch to a different thread. Goroutines have a much lower overhead for context switching compared to threads, as they are much lighter and require less state to be saved and restored.
Resource sharing: Goroutines share resources with each other and with the underlying process, eliminating the need for explicit resource allocation and deallocation. This reduces overhead and allows for more efficient use of system resources.
Overall, the combination of a small stack size, efficient scheduling, low overhead context switching, and efficient resource sharing makes goroutines much lighter and more efficient than threads or processes, and allows for many more concurrent units of execution within a single process.
Cooperative. The scheduler uses a cooperative scheduling model, which means that goroutines voluntarily yield control to the runtime when they are blocked or waiting for an event.
Timer-based preemption. The scheduler uses timer-based preemption to interrupt the execution of a running goroutine and switch to another goroutine if it exceeds its time slice.
Work-stealing. The scheduler uses a work-stealing algorithm: each CPU has its own local run queue, and goroutines are dynamically moved between run queues to balance the load and improve performance.
No explicit prioritization. The Go runtime scheduler does not provide explicit support for prioritizing goroutines. Instead, it relies on the cooperative nature of goroutines to ensure that all goroutines make progress. A well-designed Go program should be structured so that all goroutines make progress in a fair and balanced manner.
The number of Gs can be far larger than the number of Ms; in other words, a Go program can support a huge number of concurrent goroutines on a small number of kernel threads. Multiple goroutines share an M's compute resources through user-level context switches, so the operating system incurs no kernel thread-switch overhead. The scheduler also supports a work-stealing strategy: to improve parallelism and overall throughput, when the G workload across Ps is unbalanced, the scheduler allows a P to take Gs from the global run queue (GRQ) or from another P's local run queue (LRQ).
If a goroutine performs an operation such as sleep, it can block its M. The Go runtime has a background monitoring thread, sysmon, which watches for Gs that have been running too long and sets a preemption flag on them, so that other goroutines can preempt them and run.
What are the states of Goroutine and how do they flow?
Goroutine state transitions: Grunnable, Grunning, Gwaiting
In Go, a Goroutine can be in one of several states during its lifetime. The states are:
New: The Goroutine is created but has not started executing yet.
Running: The Goroutine is executing on a machine-level thread.
Waiting: The Goroutine is waiting for some external event, such as I/O, channel communication, or a timer.
Sleeping: The Goroutine is sleeping, or waiting for a specified amount of time.
Dead: The Goroutine has completed its execution and is no longer running.
In summary, the lifetime of a Goroutine in Go starts when it is created and ends when it completes its execution or encounters a panic, and can be influenced by synchronization mechanisms such as channels and wait groups.
What are the memory leak scenarios in Go language?
Goroutine leaks: If a goroutine is created and never terminated, it can result in a memory leak. This can occur when a program creates a goroutine to perform a task but fails to provide a mechanism for the goroutine to terminate, such as a channel to receive a signal to stop.
Leaked closures: Closures are anonymous functions that capture variables from their surrounding scope. If a closure is created and assigned to a global variable, it can result in a memory leak, as the closure will continue to hold onto the captured variables even after they are no longer needed.
Incorrect use of channels: Channels are a mechanism for communicating between goroutines. If a program creates a channel but never closes it, it can result in a memory leak. Additionally, if a program receives values from a channel but never discards them, they will accumulate in memory and result in a leak.
Unclosed resources: In Go, it’s important to close resources, such as files and network connections, when they are no longer needed. Failure to do so can result in a memory leak, as the resources and their associated memory will continue to be held by the program.
Unreferenced objects: In Go, unreferenced objects are objects that are no longer being used by the program but still exist in memory. This can occur when an object is created and never explicitly deleted or when an object is assigned a new value and the old object is not properly disposed of. By following best practices and being mindful of these common scenarios, you can help to avoid memory leaks in your Go programs. Additionally, you can use tools such as the Go runtime profiler to detect and diagnose memory leaks in your programs.
Marking phase: In this phase, the Go runtime identifies all objects that are accessible by the program and marks them as reachable. Objects that are not marked as reachable are considered unreachable and eligible for collection.
Sweeping phase: In this phase, the Go runtime scans the memory heap and frees all objects that are marked as unreachable. The memory space occupied by these objects is now available for future allocation.
Compacting phase: some classic collectors add a third phase that rearranges the surviving objects on the heap to reduce fragmentation. Go's collector, however, is non-moving and does not compact the heap; it instead limits fragmentation through the size-segregated spans of its memory allocator.
Mark-sweep: mark-and-sweep is a classic garbage collection algorithm dating back to the 1970s. When a memory threshold or time interval is reached, the collector suspends the user program, known as STW (stop the world). The collector traverses all objects reachable by the program to determine which memory cells can be reclaimed, hence the two phases: the mark phase marks memory that is still in use and must not be freed, and the sweep phase releases the memory cells that are no longer needed. The biggest problem with mark-sweep is the STW pause: when the program uses a lot of memory, performance is poor and latency is high.
In Go, a closure is a function that has access to variables from its outer (enclosing) function’s scope. The closure “closes over” the variables, meaning that it retains access to them even after the outer function has returned. This makes closures a powerful tool for encapsulating data and functionality and for creating reusable code.
func memoize(f func(int) int) func(int) int {
	cache := make(map[int]int)
	return func(n int) int {
		if val, ok := cache[n]; ok {
			return val
		}
		result := f(n)
		cache[n] = result
		return result
	}
}
func fibonacci(n int) int {
	if n <= 1 {
		return n
	}
	return fibonacci(n-1) + fibonacci(n-2)
}
func main() {
	fib := memoize(fibonacci)
	for i := 0; i < 10; i++ {
		fmt.Println(fib(i))
	}
}
Factorial
package main
import "fmt"
func main() {
	// A closure assigned with := cannot refer to itself; declare the
	// variable first so the function body can call it recursively.
	var factorial func(n int) int
	factorial = func(n int) int {
		if n <= 1 {
			return 1
		}
		return n * factorial(n-1)
	}
	fmt.Println(factorial(5)) // 120
}
// Local per-P Pool appendix.
type poolLocalInternal struct {
private interface{} // Can be used only by the respective P.
shared []interface{} // Can be used by any P.
Mutex // Protects shared.
}
$ go build -gcflags=-m test_esc.go
command-line-arguments
./test_esc.go:9:17: Sum make([]int, count) does not escape
./test_esc.go:23:13: answer escapes to heap
./test_esc.go:23:13: main ... argument does not escape
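The file test_esc.go itself is not shown; the sketch below is a guess at code consistent with that compiler output: a slice allocated inside Sum that never leaves the function (so escape analysis keeps it on the stack), and a result that escapes to the heap when passed to fmt.Println's variadic interface parameter. The names count and the loop body are assumptions.

```go
package main

import "fmt"

const count = 100

// Sum allocates a slice that never leaves the function, so escape
// analysis can keep it on the stack ("make([]int, count) does not escape").
func Sum() int {
	numbers := make([]int, count)
	for i := range numbers {
		numbers[i] = i + 1
	}
	total := 0
	for _, n := range numbers {
		total += n
	}
	return total
}

func main() {
	answer := Sum()
	// Passing answer through fmt.Println's ...interface{} parameter is
	// what makes it "escape to heap" in the analysis output.
	fmt.Println(answer)
}
```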
The Go runtime is a collection of software components that provide essential services for Go programs, including memory management, garbage collection, scheduling, and low-level system interaction. The runtime is responsible for managing the execution of Go programs and for providing a consistent, predictable environment for Go code to run in.
At a high level, the Go runtime is responsible for several core tasks:
Memory management: The runtime manages the allocation and deallocation of memory used by Go programs, including the stack, heap, and other data structures.
Garbage collection: The runtime automatically identifies and frees memory that is no longer needed by a program, preventing memory leaks and other related issues.
Scheduling: The runtime manages the scheduling of Goroutines, the lightweight threads used by Go programs, to ensure that they are executed efficiently and fairly.
Low-level system interaction: The runtime provides an interface for Go programs to interact with low-level system resources, including system calls, I/O operations, and other low-level functionality.
The Go runtime is an essential component of the Go programming language, and it is responsible for many of the language’s unique features and capabilities. By providing a consistent, efficient environment for Go code to run in, the runtime enables developers to write high-performance, scalable software that can run on a wide range of platforms and architectures.
Take a Go-built executable, e.g. test, and inspect its entry point with gdb (gdb test, then info files):
(gdb) info files
Symbols from "/home/terse/code/go/src/learn_golang/test_init/main".
Local exec file: '/home/terse/code/go/src/learn_golang/test_init/main', file type elf64-x86-64.
Entry point: 0x452110
.....
Use a breakpoint at the entry address to find the source file behind it:
(gdb) b *0x452110
Breakpoint 1 at 0x452110: file /usr/local/go/src/runtime/rt0_linux_amd64.s, line 8.
Follow each file and line number in turn, set breakpoints, and step to the relevant lines to inspect them:
(gdb) b _rt0_amd64
(gdb) b runtime.rt0_go
At this point the platform-specific bootstrap implemented in assembly is complete; all subsequent code is written in Go. It initializes, in turn, the command-line arguments, the memory allocator, the garbage collector, the goroutine scheduler, and so on.
//go:notinheap type stackpoolItem struct { mu mutex span mSpanList }
// Global pool of large stack spans. var stackLarge struct { lock mutex free [heapAddrBits - pageShift]mSpanList // free lists by log_2(s.npages) }
func stackinit() {
	if _StackCacheSize&_PageMask != 0 {
		throw("cache size must be a multiple of page size")
	}
	for i := range stackpool {
		stackpool[i].item.span.init()
		lockInit(&stackpool[i].item.mu, lockRankStackpool)
	}
	for i := range stackLarge.free {
		stackLarge.free[i].init()
		lockInit(&stackLarge.lock, lockRankStackLarge)
	}
}
newproc needs an initial stack
if gp.stack.lo == 0 {
	// Stack was deallocated in gfput or just above. Allocate a new one.
	systemstack(func() {
		gp.stack = stackalloc(startingStackSize)
	})
	gp.stackguard0 = gp.stack.lo + _StackGuard
}
func produce() {
	a := total / producerLimit
	b := total % producerLimit
	var wg sync.WaitGroup
	for i := 0; i < int(producerLimit); i++ {
		batch := a
		if i < int(b) {
			batch += 1
		}
		wg.Add(1)
		go func(x int32) {
			defer wg.Done()
			for j := 0; j < int(x); j++ {
				num := rand.Intn(10)
				atomic.AddInt32(&AtomicSum, int32(num))
				Q <- int32(num)
			}
		}(batch)
	}
	go func() {
		wg.Wait()
		close(Q)
	}()
}
func consumer() int32 {
	var wg sync.WaitGroup
	for i := 0; i < int(consumerLimit); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			var batchSum int32 = 0
			for num := range Q {
				batchSum += num
			}
			SumQ <- batchSum
		}()
	}
	go func() {
		wg.Wait()
		close(SumQ)
	}()
	var ans int32 = 0
	for sum := range SumQ {
		ans += sum
	}
	return ans
}
GIL (Global Interpreter Lock): before interpreting and executing any Python code, CPython must first acquire the GIL; it holds the GIL while running and releases it while blocking on I/O. If the code involves no I/O and is purely CPU-bound, the interpreter periodically releases the GIL anyway (every 100 ticks in older CPython versions; newer versions use a time-based switch interval). The GIL is an implementation detail introduced by the CPython interpreter, not a feature of the Python language. Because of the GIL, the threading module is effectively useless for CPU-bound work: a thread must hold the GIL to actually run, and there is only one GIL, so multiple cores cannot be exploited. To get around the GIL's limitation, use process-based modules such as multiprocessing, where each process has its own CPython interpreter and therefore its own GIL.
class TemplateTracker {
public:
    TemplateTracker();

    void SetSampling(unsigned int sample_i, unsigned int sample_j);
    void SetLambda(double lamda);
    void SetIterationMax(unsigned int n);
    void SetPyramidal(unsigned int nlevels, unsigned int level_to_stop);
    void SetUseTemplateSelect(bool bselect);
    void SetThresholdGradient(float threshold);

    int Init(unsigned char* imgData, unsigned int h, unsigned int w, int* ref, unsigned int points_num, bool bshow);
    int InitWithMask(unsigned char* imgData, unsigned int h, unsigned int w, int* ref, unsigned int points_num, bool bshow, unsigned char* mask_data, int h2, int w2);

    int ComputeH(unsigned char* imgData, unsigned int h, unsigned int w, float* H_matrix, int num);
    int ComputeHWithMask(unsigned char* imgData, unsigned int h, unsigned int w, float* H_matrix, int num, unsigned char* mask_data, int h2, int w2);
    void Reset();
    ~TemplateTracker();
    void say_hello();
};
%apply (unsigned char* IN_ARRAY2, int DIM1, int DIM2) {(unsigned char* imgData, unsigned int h, unsigned int w)}
%apply (unsigned char* IN_ARRAY2, int DIM1, int DIM2) {(unsigned char* mask_data, int h2, int w2)}
%apply (int* IN_ARRAY1, int DIM1) {(int* ref, unsigned int points_num)}
%apply (float* INPLACE_ARRAY1, int DIM1) {(float* H_matrix, int num)}
NUMS = 100000

def job2():
    ''' cpu and io '''
    for i in range(NUMS):
        print("hello,world")
def multi_threads(num, job):
    threads = []
    for i in range(num):
        t = threading.Thread(target=job, args=())
        threads.append(t)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
def multi_process(num, job):
    process = []
    for i in range(num):
        p = multiprocessing.Process(target=job, args=())
        process.append(p)
    for p in process:
        p.start()
    for p in process:
        p.join()
C-style casts (type casts) are simple: whatever the conversion, it is written TYPE b = (TYPE)a. C++ instead provides four cast operators for different situations. const_cast: literally, removes the const qualifier. static_cast: a compile-time (static) conversion, e.g. int to char, similar to a C-style cast; it is unconditional, and is the right choice for conversions between basic types. dynamic_cast: a dynamic conversion for polymorphic types, e.g. between derived and base classes; it is conditional, performed at runtime with a type-safety check (returning NULL on failure for pointer conversions), and is the right choice for casts between polymorphic classes. reinterpret_cast: merely reinterprets the type, performing no conversion of the underlying bits. All four share the same syntax, e.g. TYPE b = static_cast<TYPE>(a). Reference: https://www.cnblogs.com/goodhacker/archive/2011/07/20/2111996.html
9. The role of the volatile keyword
volatile int i = 10; volatile tells the compiler that i may change at any time, so every use must read it afresh from i's memory address; the generated assembly therefore re-reads i from memory before placing the value in b. The usual optimization is the opposite: when the compiler sees that no code between two reads of i modifies it, it reuses the previously loaded value for b instead of reading i again. If i is a hardware register or represents an I/O port, that optimization easily produces wrong results. In short, volatile forces direct access to the original memory address and disables register-caching optimizations at execution time.
-include is used to include a header file, but in general headers are included with #include in the source itself, so the -include option is rarely used. -I specifies a header search directory; /usr/include usually does not need to be specified because gcc already knows to look there, but if a header lives elsewhere you must point to it with -I. For example, if headers are in /myinclude, add -I/myinclude to the compile command; without it you get an "xxxx.h: No such file or directory" error. -I accepts relative paths, e.g. -I. for headers in the current directory.
perf top displays live performance statistics for the current system. It is mainly used to observe the overall state of the system, for example to see which kernel functions or user processes are currently the most time-consuming.
3. perf record/perf report
After using top and stat, you have a rough picture of the program's basic performance; to optimize it, you need finer-grained information. Suppose you have already concluded that the target program is compute-heavy, perhaps because some code is not written efficiently. Facing a long source file, which lines actually need changing? This is where perf record collects function-level statistics and perf report displays the results. Focus your tuning on the hot code with the highest percentages: if a stretch of code accounts for only 0.1% of the program's total runtime, then even optimizing it down to a single machine instruction improves overall performance by at most 0.1%. As the saying goes, use good steel for the blade's edge: optimize the hot functions.
Dynamic memory management errors. Memory is commonly allocated in three ways: static storage, stack allocation, and heap allocation. Global variables use static storage, allocated at compile time; local variables inside a function are allocated on the stack; and the most flexible form is heap allocation, also known as dynamic memory allocation. Common dynamic allocation functions include malloc, calloc, realloc, and new; the corresponding release functions are free and delete. Once dynamic memory is successfully allocated, we must manage it ourselves, and this is where mistakes are easiest to make. The Valgrind output below comes from a program containing the common errors of dynamic memory management: a. memory not freed after use; b. memory read or written after being freed; c. memory freed twice.
==102507== Memcheck, a memory error detector
==102507== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==102507== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==102507== Command: ./a.out
==102507==
==102507== Conditional jump or move depends on uninitialised value(s)
==102507==    at 0x1091F6: main (learn_valgrind.cpp:14)
==102507==
10
==102507== Invalid write of size 4
==102507==    at 0x109270: main (learn_valgrind.cpp:23)
==102507==  Address 0x4dc30c0 is 0 bytes inside a block of size 40 free'd
==102507==    at 0x483A55B: operator delete[](void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==102507==    by 0x10926B: main (learn_valgrind.cpp:22)
==102507==  Block was alloc'd at
==102507==    at 0x48394DF: operator new[](unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==102507==    by 0x109254: main (learn_valgrind.cpp:21)
==102507==
==102507==
==102507== HEAP SUMMARY:
==102507==     in use at exit: 40 bytes in 1 blocks
==102507==   total heap usage: 4 allocs, 3 frees, 73,808 bytes allocated
==102507==
==102507== LEAK SUMMARY:
==102507==    definitely lost: 40 bytes in 1 blocks
==102507==    indirectly lost: 0 bytes in 0 blocks
==102507==      possibly lost: 0 bytes in 0 blocks
==102507==    still reachable: 0 bytes in 0 blocks
==102507==         suppressed: 0 bytes in 0 blocks
==102507== Rerun with --leak-check=full to see details of leaked memory
==102507==
==102507== For counts of detected and suppressed errors, rerun with: -v
==102507== Use --track-origins=yes to see where uninitialised values come from
==102507== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
How can reliable transmission be built on UDP? Since UDP is already an unreliable transport, the application layer must implement its own reliability mechanisms. In short, to build reliable, connection-oriented data transfer on top of UDP, you must reproduce TCP-like machinery: retransmission on timeout (timers), in-order delivery (sequence numbers on packets), acknowledgements (a Seq/Ack mechanism), and sliding-window flow control (the sliding-window protocol). In effect, you implement TCP's reliable-transfer mechanisms one layer above the transport layer (or directly in the application layer), for example with UDP datagrams plus sequence numbers, or UDP datagrams plus timestamps. Several reliable-UDP mechanisms already exist, such as UDT (UDP-based Data Transfer Protocol), an Internet data-transfer protocol whose main goal is bulk data transfer over high-speed wide-area networks, where the Internet's standard transport, TCP, performs poorly on high-bandwidth, long-distance links. As the name suggests, UDT is built on top of UDP and introduces its own congestion control and reliability mechanisms. UDT is a connection-oriented, bidirectional application-layer protocol that supports both reliable stream transfer and partially reliable datagram transfer. Because UDT is implemented entirely over UDP, it is also used in areas beyond high-speed data transfer, such as peer-to-peer (P2P) applications, firewall traversal, and multimedia transport.