In an ongoing research project, we treat Twitter, the social network service, as a market in which people produce and consume information by tweeting and following others. Say user A and user B are two frequent tweeters. One interesting question is how similar the information in A’s and B’s tweets is. One way to answer this is, of course, to do a semantic analysis of their tweets. Another way, which we propose, is to look at their respective followers. Presumably everyone has preferences over the kind of information she consumes, so the fact that she follows someone tells us that, to some extent, she likes that person’s tweets. We can therefore predict that if A’s and B’s tweets are similar, they should have followers with similar preferences; conversely, if A’s and B’s tweets are quite different, the followers they attract should have quite different preferences.
We define Followers-Similarity-Index (FSI) of A and B as
For fun, I computed the FSIs for the following Chinese Twitter users: @flypig, @junyu, @turingbook, @mozhixu, @glif, @williamlong, @DashHuang, @Stefsunyanzi, @virushuo, @WangShuo, @xiaolai, @zuola, @wglxh, @wangpei, @gaojiamin, @ag108lau, @arthur369, @mranti, @songshinan, @hecaitou, @duanzi, @isaac, @shizhao, @luoyonghao, @jaqi, @jason5ng32, @maoz, @izlmichael, @roseluqiu, @livid, @onlyswan, @aiww, @fzhenghu, @zhangfacai. There are 34 people and hence 34 * 33 / 2 = 561 pairwise combinations. I used the data collected at 0:00 Feb 17 CST. Below are the 561 computed FSIs, sorted.
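For readers who want to try something similar, here is a minimal sketch of the pairwise bookkeeping in Python. The FSI formula itself is not reproduced in this text, so the sketch uses the Jaccard overlap of two follower sets purely as a stand-in assumption; it is not the FSI defined above. The function names and the toy follower data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity (an assumption, NOT the FSI):
    shared followers divided by all followers of either user."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_scores(followers, similarity=jaccard):
    """Score every unordered pair of users, sorted from most to least similar."""
    scores = [
        (similarity(followers[u], followers[v]), u, v)
        for u, v in combinations(sorted(followers), 2)
    ]
    return sorted(scores, reverse=True)


# Hypothetical toy data: user -> set of follower ids (not real Twitter data).
toy = {
    "@userA": {1, 2, 3, 4},
    "@userB": {3, 4, 5},
    "@userC": {9},
}
ranked = pairwise_scores(toy)

# 34 users give 34 * 33 / 2 unordered pairs.
n_pairs = len(list(combinations(range(34), 2)))
```

Any other similarity measure can be passed in through the `similarity` argument without changing the pair enumeration or the sorting.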
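An old question in labor economics is how much education really affects income. Written as an equation:
income = A * years of education + other factors that affect income
Driven by some odd impulse, some economists are interested in this A and even want to compute it from data. Anyone who has studied statistics knows that simply regressing income on years of education will not give the correct A, because that simple method is valid only if a person's "years of education" is uncorrelated with the "other factors that affect her income". That assumption does not hold here: a person's ability, which can be neither seen nor touched, is an important determinant of income and also affects how many years of education she gets. So the A from a simple regression mixes the effects of education and ability, and may overestimate the effect of years of education on income.
To get the correct A, the econometric remedy is to find another variable (an instrumental variable) that is correlated with years of education but uncorrelated with the "other factors that affect income", such as ability. The variable some economists have proposed is "whether a person has a sister": demographic studies find that people with sisters have, on average, fewer years of education than people without, and it seems that a person's ability should be unrelated to whether she has a sister.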
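To make the argument concrete, here is a toy simulation in Python. Everything in it is an assumption chosen for illustration: the true A of 1.0, the coefficients, the normal noise, and a "has a sister" indicator drawn independently of ability (that is, the instrument is valid by construction). It compares the simple regression slope with the instrumental-variable estimate cov(income, sister) / cov(education, sister).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
A_TRUE = 1.0  # assumed true effect of one year of education on income

ability = rng.normal(size=n)             # unobserved by the econometrician
has_sister = rng.integers(0, 2, size=n)  # 0/1, independent of ability here
# Assumption: having a sister lowers education by one year on average.
education = 12 - 1.0 * has_sister + ability + rng.normal(size=n)
# Ability raises income directly, on top of its effect through education.
income = A_TRUE * education + 2.0 * ability + rng.normal(size=n)


def cov(x, y):
    """Sample covariance of two 1-D arrays."""
    return np.mean((x - x.mean()) * (y - y.mean()))


# Simple regression slope: picks up ability as well as education.
a_ols = cov(income, education) / cov(education, education)
# Instrumental-variable (Wald) estimate using the sister indicator.
a_iv = cov(income, has_sister) / cov(education, has_sister)

print(f"simple regression: {a_ols:.2f}, IV: {a_iv:.2f}, truth: {A_TRUE}")
```

Under these assumptions the simple regression overshoots the true A, while the instrumental-variable estimate lands close to it.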
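A paper I read recently says: not so, a person's ability is correlated with whether she has a sister! The logic goes like this:
Whether a person has a sister is correlated with how many siblings she has in total; the more siblings, the higher the probability of having a sister
--> The total number of siblings is the outcome of the parents' optimal fertility decision
--> That decision depends on the parents' ability, their income, their preferences over the number and gender composition of their children, their investment in the next generation, and other factors
--> Among these, the parents' ability and their investment in the next generation both affect the children's ability
--> So a person's ability is correlated with whether she has a sister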
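The chain above can also be sketched as a toy simulation in Python. All functional forms are my own assumptions for illustration, not taken from the paper: more able parents are assumed to have fewer children (the direction is arbitrary; any link would do), each sibling is a girl with probability one half, and a child's ability is partly inherited from her parents.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
A_TRUE = 1.0  # assumed true effect of one year of education on income

parent_ability = rng.normal(size=n)
# Assumption: more able parents choose smaller families.
n_siblings = rng.poisson(np.exp(-0.5 * parent_ability))
# Each sibling is a girl with probability 1/2.
n_sisters = rng.binomial(n_siblings, 0.5)
has_sister = (n_sisters > 0).astype(float)

# Assumption: the child's ability is partly inherited from the parents.
ability = 0.7 * parent_ability + 0.7 * rng.normal(size=n)
education = 12 - 1.0 * has_sister + ability + rng.normal(size=n)
income = A_TRUE * education + 2.0 * ability + rng.normal(size=n)


def cov(x, y):
    """Sample covariance of two 1-D arrays."""
    return np.mean((x - x.mean()) * (y - y.mean()))


# The instrument is now correlated with the unobserved ability ...
cov_ability_sister = cov(ability, has_sister)
# ... so the instrumental-variable estimate no longer recovers A_TRUE.
a_iv = cov(income, has_sister) / cov(education, has_sister)

print(f"cov(ability, sister): {cov_ability_sister:.3f}, IV: {a_iv:.2f}")
```

With these assumptions the sister indicator is correlated with ability, and the instrumental-variable estimate is pushed away from the true A.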
Economists' reasoning, just for fun.
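This is Michael Zhan Shi's personal blog site. "EConst" is a combination of "e, the natural constant" and "economist". I come from beautiful Zhejiang, a land of green hills and clear waters in ancient China in the Far East, and I am now studying in wild Texas in the New World. I am here sharing my experiences and learnings on our beautiful earth.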








