2012-02-23

The Design Rules of API


:Author: gashero
:Date: 2012-02-24

A good API is easy for callers to use; a bad API is hard to debug. This article introduces the API design rules I follow. Each rule is based on a lesson learned from some failure.

1. Separate ID-LIST and GET-ONE

Each read API should be either an ID-LIST or a GET-ONE. An ID-LIST API returns a list of object_id values, with no detail about the objects. A GET-ONE API returns the full information of exactly ONE object.

If your API is designed by this rule, most requirements will be covered by these two types of API: some callers get a list, others get the detail. An ID-LIST API also returns within bounded time.

What if you do NOT follow it? If you develop an API for every requirement, you end up with many APIs and a lot of repeated code, and the APIs are not reusable. And when a requirement needs one more attribute of an object, you must modify the API. With ID-LIST & GET-ONE you only need to modify the application's code, because the GET-ONE API already returns every attribute.
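As a rough illustration of the rule, here is a minimal Python sketch. The in-memory store and the names list_user_ids_by_city and get_user are hypothetical and stand in for whatever backend and naming a real API would use.

```python
# Hypothetical in-memory store standing in for a real database.
_USERS = {
    1: {"id": 1, "name": "alice", "city": "Beijing"},
    2: {"id": 2, "name": "bob", "city": "Shanghai"},
    3: {"id": 3, "name": "carol", "city": "Beijing"},
}


def list_user_ids_by_city(city):
    """ID-LIST: return only the ids of matching objects, no details."""
    return sorted(uid for uid, user in _USERS.items() if user["city"] == city)


def get_user(user_id):
    """GET-ONE: return the full record of exactly one object."""
    return dict(_USERS[user_id])


# The application combines the two calls and picks the fields it needs.
names = [get_user(uid)["name"] for uid in list_user_ids_by_city("Beijing")]
```

If the application later needs the city as well, only the last line changes; neither API has to be touched.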

2. Dataset slice, count & offset

Every ID-LIST API needs dataset slicing, by count & offset. Even if your requirements do not mention it now, it will appear later. If you design it in from the start, you will not need to modify the API later.

Note that some languages allow parameters to have default values. You can give offset a default value of 0, but DO NOT give count a default value, because a caller may forget it and cause a bug. For example, a caller who wants the ids of all objects but forgets the count parameter will silently lose every id beyond the default count.
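A small Python sketch of such a signature, assuming a hypothetical list_object_ids function over ten stored ids: offset defaults to 0, while count has no default, so forgetting it fails loudly instead of truncating the result.

```python
_IDS = list(range(1, 11))  # hypothetical stored object ids 1..10


def list_object_ids(count, offset=0):
    """Return at most `count` ids starting at `offset`.

    `count` has no default, so a caller cannot silently get a short list.
    """
    if count < 0 or offset < 0:
        raise ValueError("count and offset must be non-negative")
    return _IDS[offset:offset + count]


first_page = list_object_ids(count=4)
second_page = list_object_ids(count=4, offset=4)
```

Calling list_object_ids() with no arguments raises TypeError, which is the point: the mistake shows up at the call site.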

3. Start with an action

Naming an API is a difficult task: you need a meaningful name that does not collide with the name of another API. Starting the name with an action is a good choice. It is like Hungarian notation in that the name leads with its class, though here the class is the action, not the data type.

Some useful prefixes for API names:

1. add: create a new object.
2. delete: delete an object.
3. update: modify the value of one field of an object.
4. get: get the full detail of one object by object_id.
5. list: get the list of ids of the objects that match some rule.
6. count: get the count of objects that match some rule.

4. Do not call each other

APIs should not call each other; avoid making them depend on each other. Between independence and reuse, independence is more useful most of the time. Sometimes you think something is reusable, but actually it is not.

5. Set operations

If an API updates a set of objects, the right way is to add or delete single elements of the set, not to write the whole set. Writing the whole set needs a lot of communication bandwidth, and it cannot give the right result in a concurrent environment, where one caller's write overwrites another caller's change.
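The following Python sketch shows element-level set operations under concurrency. The TagSet class and its method names are hypothetical; the lock is one simple way to keep each add or delete atomic.

```python
import threading


class TagSet:
    """Hypothetical server-side set with element-level operations."""

    def __init__(self):
        self._tags = set()
        self._lock = threading.Lock()

    def add_tag(self, tag):
        with self._lock:
            self._tags.add(tag)

    def delete_tag(self, tag):
        with self._lock:
            self._tags.discard(tag)

    def list_tags(self):
        with self._lock:
            return sorted(self._tags)


tags = TagSet()
# Two callers each add one element; neither overwrites the other.
workers = [threading.Thread(target=tags.add_tag, args=(t,)) for t in ("red", "blue")]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

With a "write whole set" API, the same two callers would each send a full set built from a stale read, and the later write would drop the other caller's element.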

6. No super API

If you provide an API with many features, callers will abuse it everywhere. When you later want to refactor, this API will kill you, because you do not know how callers use it.

A super API breaks the design: it makes the API unclear, and it lets callers depend on the internals of your implementation.

7. More debug information, not "return NULL"

Debugging is more important than performance. "return NULL" is a bad idea for debugging, because it tells the caller nothing.

Define an exception class with fields that tell the caller what went wrong. Some useful fields:

1. errnum: an error number that a program can recognize.
2. errmsg: an error message that a person can read.
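A minimal Python sketch of such an exception class. The names ApiError, ERR_NOT_FOUND and get_object are hypothetical, and the error number is chosen only for illustration.

```python
class ApiError(Exception):
    """Hypothetical API exception carrying errnum and errmsg."""

    def __init__(self, errnum, errmsg):
        super().__init__(errnum, errmsg)
        self.errnum = errnum  # for programs
        self.errmsg = errmsg  # for people

    def __str__(self):
        return "[%d] %s" % (self.errnum, self.errmsg)


ERR_NOT_FOUND = 404  # example error number, chosen for illustration


def get_object(store, object_id):
    """Raise ApiError instead of returning None for a missing object."""
    if object_id not in store:
        raise ApiError(ERR_NOT_FOUND, "object %r does not exist" % (object_id,))
    return store[object_id]


try:
    get_object({}, 7)
except ApiError as exc:
    report = str(exc)
```

The caller can branch on exc.errnum and log exc.errmsg, instead of guessing why it received nothing.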

8. Update API, (id, field, value)

Modifying a field of an object needs only these three parameters: select the object by object_id, then set its field to value. The field can be selected by a string; this pattern covers many requirements.

9. Use strings as types, no magic numbers

There are many type fields in interface and database designs. A string is readable, so it is friendlier to debug. If a magic number serves as the type field in a large system, callers will struggle to keep track of what each number means.

10. Separate interface and logic

An interface is sometimes so simple, with no logic in it, that it hardly needs testing. But if logic lives in the interface, it is hard to test.

Automated testing is very good for software quality. Separating interface from logic makes the logic suitable for testing, and unit tests will help you raise quality.
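One possible shape of this separation in Python. The discount example and both function names are invented for illustration: the logic is a pure function, and the interface only decodes and encodes.

```python
import json


def apply_discount(price_cents, percent):
    """Logic: pure function, easy to unit test without any I/O."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


def handle_discount_request(body):
    """Interface: only decodes the request and encodes the response."""
    params = json.loads(body)
    result = apply_discount(params["price_cents"], params["percent"])
    return json.dumps({"price_cents": result})
```

A unit test can call apply_discount directly with plain numbers, without building a request or parsing a response.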

11. Do not mix code

Mixing languages in one piece of code decreases your program's readability. If you write SQL inside your functions, the code becomes confusing.

You can define the SQL at the top of your source file and refer to it from your code.
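A minimal Python sketch of this layout, using the standard sqlite3 module with an in-memory database. The table and the constant names are hypothetical.

```python
import sqlite3

# All SQL lives at the top of the file as named constants.
SQL_CREATE_USER_TABLE = "CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)"
SQL_ADD_USER = "INSERT INTO user (name) VALUES (?)"
SQL_GET_USER = "SELECT id, name FROM user WHERE id = ?"


def add_user(conn, name):
    """Insert one user and return its new id."""
    cursor = conn.execute(SQL_ADD_USER, (name,))
    return cursor.lastrowid


def get_user(conn, user_id):
    """Return the (id, name) row of one user, or None."""
    return conn.execute(SQL_GET_USER, (user_id,)).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(SQL_CREATE_USER_TABLE)
new_id = add_user(conn, "alice")
```

The function bodies stay pure Python, and every statement the module can send to the database is visible in one place.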

How to decode forex.com's foreign exchange protocol

What is forex.com & What I do

forex.com is a leading vendor of online foreign exchange trading. It trades 200 billion dollars per month.

I am interested in foreign exchange rate data, and I want to verify my mathematical model, so I need some real data. forex.com is a good choice.

I first decoded forex.com's foreign exchange communication protocol in July 2007. After that, forex.com changed the protocol 4 times. I last decoded it in September 2011, and the program in this article relates to that version.

The product of decoding the protocol is a re-implementation of a forex.com client in Python.

Protocol

forex.com's foreign exchange protocol has a leading protocol header, followed by the protocol body. The protocol body repeats many times. Even if you don't disconnect, the data flow stops after 4 hours, so you need a mechanism to reconnect your client.
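One way to sketch a reconnect mechanism in Python. This is not the original client: connect is a hypothetical callable that opens a new session and yields received chunks until the stream ends, and the retry delay is an assumed value.

```python
import time


def stream_with_reconnect(connect, handle, max_sessions, retry_delay=5.0,
                          sleep=time.sleep):
    """Feed chunks to `handle`, opening a new session whenever one ends.

    `connect` is a hypothetical callable that returns an iterable of chunks
    and stops (or raises OSError) when the server ends the stream.
    """
    for session in range(max_sessions):
        if session:
            sleep(retry_delay)  # wait a little before reconnecting
        try:
            for chunk in connect():
                handle(chunk)
        except OSError:
            continue  # treat a dropped connection like a normal end of stream
```

A long-running client would pass a large max_sessions and a connect function that opens the socket, sends the request string and yields whatever it receives.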

The first char of the protocol data flow is "S". It is not important and you can drop it. Maybe this char just lets the client confirm the protocol.

You can use regular expressions to recognize the protocol. That is simpler than parsing it by hand.

The protocol header is recognized by the regex below:

RE_HEADER=re.compile(r"""^(?P<keyname>\d+)\\(?P<p1>\w{3})\/(?P<p2>\w{3})\\(?P<bid>[\.\d]+)\\(?P<ask>[\.\d]+)\\(?P<high>[\.\d]+)\\(?P<low>[\.\d]+)\\(?P<dr>[DR])\\(?P<ae>[AE])\\(?P<num>\d)\\(?P<close>[\.\d]+)\\$""")

This pattern repeats many times, depending on how many currency pairs are traded. Since 2010, forex.com has included gold in the protocol. As of September 2011 there were 74 currency pairs on forex.com, so the protocol header also had 74 tuples.

"keyname" lets the protocol body match a data tuple to its pair, because the body does not repeat the currency names. "p1" & "p2" are the two currency names, like "USD" or "AUD".

"bid", "ask", "high" and "low" are recent past data. You do not need them, because you will get fresher data in the protocol body. These are just initial data.

I do not know what "dr", "ae", "num" and "close" are, but I think they are not important.

Here is an example for the AUD/CAD rate, as a Python dict:

{'p2': 'CAD', 'p1': 'AUD', 'ae': 'A', 'bid': '1.01494', 'high': '1.01890', 'num': '5', 'low': '1.01390', 'ask': '1.01530', 'close': '1.01745', 'dr': 'D', 'keyname': '22'}
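The sketch below parses one header tuple in Python and builds the keyname lookup that the protocol body needs. Two things are assumptions: the order of the bid, ask, high and low groups follows the order in which the fields are described, and the raw sample line is rebuilt from the dict above rather than captured from the wire. The helper names are hypothetical.

```python
import re

RE_HEADER = re.compile(
    r"^(?P<keyname>\d+)\\(?P<p1>\w{3})/(?P<p2>\w{3})"
    r"\\(?P<bid>[\.\d]+)\\(?P<ask>[\.\d]+)\\(?P<high>[\.\d]+)\\(?P<low>[\.\d]+)"
    r"\\(?P<dr>[DR])\\(?P<ae>[AE])\\(?P<num>\d)\\(?P<close>[\.\d]+)\\$"
)

# Rebuilt from the example dict under the assumed field order; not captured data.
SAMPLE_HEADER = "22\\AUD/CAD\\1.01494\\1.01530\\1.01890\\1.01390\\D\\A\\5\\1.01745\\"


def parse_header_tuple(raw):
    """Return the named fields of one header tuple as a dict of strings."""
    match = RE_HEADER.match(raw)
    if match is None:
        raise ValueError("not a header tuple: %r" % raw)
    return match.groupdict()


def build_pair_names(header_tuples):
    """Map keyname -> 'P1/P2' so body tuples can be labelled later."""
    return {t["keyname"]: "%s/%s" % (t["p1"], t["p2"]) for t in header_tuples}


fields = parse_header_tuple(SAMPLE_HEADER)
pair_names = build_pair_names([fields])
```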

After the protocol header comes the protocol body. The regex for the protocol body is below:

RE_BODY=re.compile(r"""^R(?P<keyname>\d+)\\(?P<bid>[\.\d]+)\\(?P<ask>[\.\d]+)\\(?P<dr>[DR])\\\\\\(?P<datetime>\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2})\\$""")

Every time your client receives data from the server, it contains several tuples. The regex matches just 1 tuple. If you want to record the data, you can print the current time and then the tuples received at that time.

"keyname" is the same as in the protocol header and identifies the currency pair. "bid" and "ask" are the bid and ask prices. Again I do not know what "dr" is. "datetime" needs no explanation; I think it is not useful and only increases communication bandwidth.
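A Python sketch of parsing one body tuple and labelling it with its currency pair. The group names come from the field description above; the sample tuple is made up to fit the pattern and is not captured data, and the helper names and the keyname table are hypothetical.

```python
import re

RE_BODY = re.compile(
    r"^R(?P<keyname>\d+)\\(?P<bid>[\.\d]+)\\(?P<ask>[\.\d]+)\\(?P<dr>[DR])"
    r"\\\\\\(?P<datetime>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\\$"
)

# Made-up tuple that fits the pattern; not captured from the server.
SAMPLE_BODY = "R22\\1.01494\\1.01530\\D\\\\\\02/23/2012 11:54:06\\"


def parse_body_tuple(raw):
    """Return the named fields of one body tuple as a dict of strings."""
    match = RE_BODY.match(raw)
    if match is None:
        raise ValueError("not a body tuple: %r" % raw)
    return match.groupdict()


def label_quote(fields, pair_names):
    """Turn parsed fields into (pair, {'bid': ..., 'ask': ...})."""
    pair = pair_names[fields["keyname"]]
    return pair, {"bid": fields["bid"], "ask": fields["ask"]}


quote = label_quote(parse_body_tuple(SAMPLE_BODY), {"22": "AUD/CAD"})
```

Collecting these pairs into a dict per received chunk gives one record per timestamp, keyed by pair name.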

Here is one line of the data I captured; it is just a short line:

11:54:06 {'GBP/JPY':{'ask':'120.923','bid':'120.876'},'JPX/JPY':{'ask':'8609','bid':'8581'},'SGD/JPY':{'ask':'61.553','bid':'61.515'},'AUD/USD':{'ask':'1.02326','bid':'1.02302'},'CAD/JPY':{'ask':'77.285','bid':'77.237'},'XAU/AUD':{'ask':'1775.27','bid':'1774.02'}}

The protocol above flows from server to client. You need to send a request to start the communication. The request is just a string. One usable IP and port was 74.217.51.143:443. Do not panic, this is not the SSL protocol. One usable request string was:

"TKN0Hrs5KPx5H6dyboaNcjcS83Rg0sme/kbhufsh0ME4/l4LDf8v35/ZIqJUQj4aUkmr+KHwzD5WniNFhfd5K8YHPUN8TZQh2D8tXy07EAzoQNfypaOjkcubOcF8dDRil4ToBrsYugEF30mTZH843+Xyw==".

Note that this IP/port and request string no longer work. If you want to do something more interesting, you must capture your own with a sniffer.

What use is it to you

You can fetch real-world foreign exchange data and verify your mathematical model.

I think this analysis of the forex.com protocol does no harm to forex.com and poses no danger to it.

If you want to talk with me about mathematical models of forex trading, contact me.

2012-02-13

ATtiny13 as IR receiver



IR remote controllers are everywhere in your home, yet you cannot control whatever you like with them. If you follow me and build a very cheap PCB with an ATtiny13, you can control most of the equipment in your home. Outside your home? Maybe even more interesting.

What's that stuff?

In general, it is a PCB that receives the IR signal from your IR remote and controls other things, such as a light or a TV, with just a relay.

Why ATtiny13?

The ATtiny13 is an Atmel product from the AVR family. The chip is very cheap, and if a chip is cheap enough, you can use it anywhere you want.

The chip's strengths:

1. Small: only 8 pins, 6 GPIO
2. Cheap: $0.3
3. Powerful: 1 KB Flash, 64 bytes RAM, 64 bytes EEPROM, up to 20 MHz clock
4. Energy saving: 240 uA @ 1 MHz, <1 uA in power-down mode
5. Most important reason: I like it, :-)

How to recognize IR signal?

The most popular IR signal format is defined by NEC. It starts with a 9 ms signal, then 4.5 ms of no signal, followed by 4 bytes of data: the first 2 bytes are the address, and the last 2 bytes are the bitwise inverse of each other.

In the chip's software you need a state machine that changes state as each input edge arrives and turns it into 1 bit immediately. The ATtiny13 has only 64 bytes of RAM, so you cannot store every voltage sample one by one. If you convert on the fly, you need only 4 bytes of RAM.
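The state machine idea can be tried out on a PC before writing firmware. The Python sketch below is a host-side simulation, not ATtiny13 code. It assumes the usual NEC bit timing (a 562.5 us burst followed by a 562.5 us gap for 0 or a 1687.5 us gap for 1, least significant bit first), a 25% timing tolerance chosen for the example, and a hypothetical NecDecoder class fed with (mark, space) durations in microseconds.

```python
LEADER_MARK_US = 9000
LEADER_SPACE_US = 4500
BIT_MARK_US = 562
ZERO_SPACE_US = 562
ONE_SPACE_US = 1687
TOLERANCE = 0.25  # assumed timing tolerance


def _near(value, target):
    return abs(value - target) <= target * TOLERANCE


class NecDecoder:
    """State machine that folds each (mark, space) pair into one bit."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.waiting_for_leader = True
        self.bits = 0  # number of data bits received so far
        self.data = 0  # 32-bit accumulator, i.e. 4 bytes of state

    def feed(self, mark_us, space_us):
        """Consume one pulse pair; return the 4 data bytes when complete."""
        if self.waiting_for_leader:
            if _near(mark_us, LEADER_MARK_US) and _near(space_us, LEADER_SPACE_US):
                self.waiting_for_leader = False
            return None
        if not _near(mark_us, BIT_MARK_US):
            self.reset()
            return None
        if _near(space_us, ONE_SPACE_US):
            self.data |= 1 << self.bits  # bits arrive least significant first
        elif not _near(space_us, ZERO_SPACE_US):
            self.reset()
            return None
        self.bits += 1
        if self.bits < 32:
            return None
        frame = [(self.data >> shift) & 0xFF for shift in (0, 8, 16, 24)]
        self.reset()
        return frame
```

Only the bit counter and the 32-bit accumulator persist between pulses, which matches the 4-byte budget described above.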

Want to debug it? The ATtiny13 has no serial port, so you can write some status to EEPROM and read it out later. It may be hard, but it is cheap.

Some IR data

The address code of my Mac remote is 0x77e1; yours may be different. The button signals:

OK=0x3a5d
Vol+=0x505d
Vol-=0x305d
<<=0x905d
>>=0x605d
MENU=0xc05d
>||=0xfa5d

More features with IR

IR is very useful for control. You can use it to carry a serial link at 9600 bps. You can also use any other device as an adapter between IR and another channel, for example your cell phone as an adapter that communicates with the GSM/3G/4G network.

If you are interested in IR, contact me.

2010-04-03

Top Ten Mac OS X Tips for Unix Geeks

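Author: Brian Jepson
Translator: gashero
Original title: Top Ten Mac OS X Tips for Unix Geeks
Original date: 2007-05-15
Translation date: 2009-08-25
URL: http://macdevcenter.com/pub/a/mac/2002/10/22/macforunix.html

Contents

  • 1 Where is my shell
  • 2 sudo, not su
  • 3 Startup items
  • 4 Filesystem layout
  • 5 Different ways of hiding files
  • 6 Aliases and links
  • 7 X11
  • 8 Fink
  • 9 /etc is not always in charge
  • 10 shutdown is not really shutdown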

Note

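Editor's note:

We noticed that this five-year-old article still gets a lot of attention, so we recently asked Brian Jepson to update some sections. This article is his update of an "oldie but goodie".

It has been many years since Ernie Rothman and I wrote "Mac OS X for Unix Geeks", and I find that the top 10 tips have changed. These tips show you how Mac OS X differs from the Unix you like, help you recover your Unix skills, and show you how to use ported open source software.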

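1 Where is my shell

A Unix geek does not get far without a shell, right? You can find the Terminal application in /Applications/Utilities in the Finder. Drag Terminal to your Dock to launch it quickly.

Once Terminal is running, you get the default shell, bash. You can customize Terminal's appearance and settings from its menu. You can also set, through the preferences in the Terminal menu, the shell it starts.

2 sudo, not su

By default the root user is disabled on Mac OS X. If you need to do something as root, use the sudo command. Just put the command you want to run after sudo, for example sudo vi /etc/hostconfig . The primary user has this privilege by default.

If you need a root shell, use sudo tcsh or sudo bash . If you want to enable root, the simplest way is to give root a password with sudo passwd root . You can also open System Preferences, choose Accounts, then Login Options, and set the login window display to "Name and password". Then you can log out and log in as root.

3 Startup items

Mac OS X does not boot like other Unix systems. Mac OS X has no /etc/init.d directory. It finds its startup items through the launchd program. You can learn more in this ADC article.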

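4 Filesystem layout

When you open the top level of the hard disk in the Finder, you will see that the familiar /var and /usr are gone. They are actually hidden. If you open a terminal and run "ls /" you can still see them, along with some other directories such as /Library and /Developer .

The following list shows some of the directories you may see (Appendix A has a more detailed list):

  1. .DS_Store: Finder settings
  2. .Spotlight-V100: contains Spotlight settings
  3. .Trashes: this directory holds the files that are in the Trash
  4. .fseventsd: used by the file system events framework
  5. .hotfiles.btree: used by Mac OS X's Hot-File-Adaptive-Clustering feature to track frequently used files
  6. .vol/: this directory maps HFS+ file IDs to files
  7. Applications/: contains all Mac OS X applications; look in the Utilities/ subdirectory for many interesting tools
  8. Desktop DB, Desktop DF: the Classic Mac OS desktop database
  9. Desktop Folder/: the Mac OS 9 desktop directory
  10. Developer/: developer tools and documentation; present only after you install the developer tools
  11. Library/: support files needed by local applications
  12. Network/: network-mounted applications, libraries and user directories, as well as server directories
  13. Shared Items/: directory used by Mac OS 9 for sharing between users
  14. System Folder/: the Mac OS 9 system directory
  15. System/: contains system and application support files
  16. Temporary Items/: Mac OS 9 temporary files
  17. TheVolumeSettingsFolder/: directory used to track details such as open windows and desktop printers
  18. Trash/: the Mac OS 9 Trash directory
  19. Users/: user home directories
  20. VM Storage: the Mac OS 9 virtual memory file
  21. Volumes/: contains all mounted filesystems
  22. automount/: directory that handles static NFS mounts
  23. bin/: essential system binaries
  24. cores/: if core dumps are enabled (via tcsh's limit or bash/sh's ulimit), core.pid files are created in this directory
  25. dev/: contains files that describe various devices
  26. etc/: contains system-wide configuration
  27. mach: symbolic link to the /mach.sym file
  28. mach.sym: kernel symbols
  29. mach_kernel: the Darwin kernel
  30. private/: contains the tmp, var, etc and cores directories
  31. sbin/: executables for system administration and configuration
  32. tmp/: temporary files
  33. usr/: contains BSD Unix applications and support files
  34. var/: contains frequently modified files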

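5 Different ways of hiding files

As on other Unix systems, you can hide a file by starting its name with ".", for example /.vol . This hides it in the Finder, but "ls -a" still shows it.

Mac OS X uses the .hidden file in the root directory to keep the list of files that should be hidden in the Finder.

Likewise, HFS+ (the Mac OS filesystem) files and directories can carry a hidden attribute, set with the SetFile command: SetFile -a V . The command lives in /Developer/Tools and is available only after you install the developer tools. The setting does not take effect until you restart the Finder; you can log out and log in again, or use Force Quit from the Apple menu. You can also clear the hidden attribute with SetFile -a v . See the SetFile man page for more. Note that files with the hidden attribute are hidden only in the Finder; the ls command still shows them.

6 Aliases and links

There are two ways to create links. The first is to hold Option and Command while dragging a file to a new location in the Finder, or to choose "Make Alias" from the File menu. This creates a Mac OS alias, which Cocoa, Carbon and Classic applications can follow. Unix applications, however, ignore these links and see them as zero-byte files.

You can also use "ln" or "ln -s". Links made this way are honored by Unix, Cocoa, Carbon and Classic applications alike.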

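7 X11

Mac OS X does not use the X Window System. It uses a native advanced graphics system called Aqua instead. But if you want to run X11 applications, you are in luck: Apple has its own X11 implementation that integrates well with Aqua, and you can find the installer on the Mac OS X installation CD. If it was not installed by default, run the installer, and then you can use X11 applications just like Mac OS X ones.

8 Fink

Can't find some Unix or Linux application? Take a look at the Fink Project, which modifies open source software so that it runs on Mac OS X. Fink already includes many applications, and more are being ported.

9 /etc is not always in charge

If you come to Mac OS X from another Unix, you may expect to add users and groups through the /etc/passwd and /etc/group files. By default, Mac OS X uses these files only in single-user mode. To add users and groups you need to go to the Directory Services database, a local information store. For more information, see the ADC article http://developer.apple.com/documentation/Porting/Conceptual/PortingUnix/additionalfeatures/chapter_10_section_9.html

10 shutdown is not really shutdown

For quite a long time, Mac OS X could not run custom actions at shutdown. The SystemStarter framework can run custom shutdown actions. For more information, see http://www.macdevcenter.com/pub/a/mac/2003/10/21/startup.html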

2010-04-01

Using the Nokia N810 as a personal server

Author: gashero
Date: 2010-02-01

Contents

  • 1 Introduction
  • 2 The root of everything: SSH
  • 3 Common shell tools
  • 4 Public access: SSH tunnel
  • 5 Subversion server
  • 6 Backup and synchronization: rsync

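1 Introduction

The Nokia N810 carries a fairly complete Linux operating system, on which you can install all kinds of server software of your own. As a server you can carry around, it offers many conveniences.

2 The root of everything: SSH

Although the N810 has a hardware qwerty keyboard, it is still better to operate it remotely from a computer, and the later sections also involve communication over SSH channels.

You can install software with the Application Manager on Maemo4. Two SSH implementations are available, dropbear and OpenSSH. I strongly recommend the latter, for two reasons:

  1. dropbear's SSH client does not support an arbitrary bind address in the -R option
  2. dropbear's SSH server sometimes hangs while the device is in standby (screen off)

So, enough said: install OpenSSH, both the client and the server.

After installing, use the ssh client on the N810 to fetch the id_dsa.pub file from the client machine (your own computer) and append it to the ~/.ssh/authorized_keys file. You can then log in without a password. Speaking of passwords, I do not know the passwords of root or user (the default user name on the N810), and I once locked myself out by changing a password. So I suggest you leave the passwords alone and simply log in by key.

How to generate id_dsa.pub on Linux, or how to do it with PuTTY on Windows, is easy to find on Google.

One more point concerns locale settings: edit /etc/ssh/sshd_config and comment out the line AcceptEnv LANG LC_* . Otherwise Subversion will print some pointless warnings later.

3 Common shell tools

The N810 provides xterm, but as an embedded system it lacks many essential tools out of the box. You can install what you need yourself. A few worth mentioning:

  1. tar
  2. gzip
  3. sudo

All of these tools can be found in the Application Manager.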

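4 Public access: SSH tunnel

The advantage of the N810 as a server is that it is easy to carry: you can start it and connect it to a network wherever you want to use it. That makes a fixed access point a problem. You can solve it with an SSH tunnel that attaches the N810's ssh access to a port on a fixed server on the Internet. The command is:

ssh -f -N -g -R :<port>:localhost:22 <user>@<server>

Replace <port> with the service port on the public server; <user> and <server> are the user name and address of the public server. You can then reach the N810's ssh service at <server>:<port> . If the N810 runs other services, change the 22 in the command to the port you need.

The ssh tunnel recovers by itself from short interruptions, so if you just go out for a meal and reconnect to the network when you come back, the tunnel still works. After more than 1 hour, though, it no longer does.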

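5 Subversion server

You can probably imagine the many functions a server should have; for me Subversion is an essential one.

Because the N810 has only 128 MB of RAM plus 128 MB of swap, servers are best started on demand and stopped after use. So I do not recommend running the Subversion daemon. Instead, use the svn+ssh access protocol to serve svn over the ssh channel. Its advantage is that the svn server process starts only when it is used and exits automatically afterwards. And since it goes through the ssh channel, no extra port is needed.

There is one more problem: svn does not support specifying a port other than 22 for the svn+ssh protocol, while the N810 as a server has to be attached to a service port on another server. You can solve this by editing the [tunnels] section of .subversion/config , for example by adding an entry like this:

n810 = /usr/bin/ssh -p 65520 -l user

Your svn access path then looks like this:

svn+n810://<server>/repopath

This is simple and convenient to use.

Then, for an existing working copy, deleting it and checking out again is a bit of a hassle. You can change the repository URL with the following svn command:

svn switch --relocate <old_url> <new_url> .

After this change you can operate directly against the new address.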

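6 Backup and synchronization: rsync

The essence of backup is redundant storage of data, with the copies physically isolated from each other as far as possible. So here I introduce rsync, a very convenient synchronization and backup tool.

With rsync you can synchronize two directories on different machines while preserving the same file permissions.

Here is the command I use to back up the svn repository on another machine to the N810; adapt it to your needs and put it in a script:

#! /usr/bin/env sh
date
rsync -avz <user>@<server>:/repopath /media/mmc1/

The destination path /media/mmc1 is the root directory of the external memory card on the N810.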