Study and Development of a Fast and Automatic Astronomical-transient-identification System

WU Chao^1, MA Dong^{1,2}, TIAN Hai-Jun^2, LI Xiang-Ru^3, WEI Jian-Yan^1

doi: 10.16383/j.aas.2017.c160289

1. National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012
2. China Three Gorges University, Yichang 443002
3. South China Normal University, Guangzhou 510631

Funds:

National Natural Science Foundation of China U1331202, 61273248, U1231123, 11503012, U1431108, U1731124

Natural Science Foundation of Guangdong Province 2014A030313425

Author Bio:

WU Chao Associate professor at National Astronomical Observatories, Chinese Academy of Sciences. His research interest covers data mining and astronomical transient search. E-mail: cwu@nao.cas.cn

MA Dong Master student at China Three Gorges University and National Astronomical Observatories, Chinese Academy of Sciences. His main research interest is data mining. E-mail: md201314@yeah.net

LI Xiang-Ru Professor at South China Normal University. He received his Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2006. His research interest covers data mining and computer vision. E-mail: xiangru.li@gmail.com

WEI Jian-Yan Professor at National Astronomical Observatories, Chinese Academy of Sciences. His research interest covers observation and science of astronomical transients. E-mail: wjy@nao.cas.cn

Corresponding author:

TIAN Hai-Jun Associate professor at the College of Science, China Three Gorges University. He has been a visiting researcher at universities and research institutions in the United States and Germany. His research interest covers galactic astronomy and astroinformatics. Corresponding author of this paper. E-mail: hjtian@lamost.org

Publication history
- Received: 2016-03-24
- Accepted: 2016-12-10
- Published: 2017-12-20

Abstract

Abstract: Wide fields of view and high cadence are the two main directions in which modern optical transient surveys are developing. Compared with traditional surveys, such projects produce much larger data volumes and require transient candidates to be selected faster, automatically, and from noisy data. To meet these requirements, we present a fast and automatic identification system built on three methods: replacing the PSF-fitting features with 13 new features that measure object profiles by isophotal photometry; constructing realistic training samples by simulation from real object profiles; and adding a noise filter whose criteria are derived from an analysis of measured data. The system is implemented with a supervised machine learning technique, random forest. Tests on simulated and real data show that the system runs about 10 times faster than the mainstream identification algorithm of the same kind, while the overall true and false positive rates of the two are essentially the same. In addition, our system performs well on low signal-to-noise-ratio data owing to its isophotal features. The system has been operating successfully in the online data processing pipeline of Mini-GWAC (Miniature Ground Wide Angle Camera array), the pathfinder of the Ground Wide Angle Camera array, and it is also of practical value for other optical transient surveys.
- Machine learning
- Random forest
- Automatic transient search
- Star profile
- Isophotal photometry
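
The abstract compares the two identification systems by their true and false positive rates. As a reminder of how these quantities are usually defined, the following minimal sketch computes them from binary labels. The function name `positive_rates` and the label convention (1 for a real transient, 0 for a bogus detection) are assumptions made for illustration and are not taken from the paper.

```python
def positive_rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels.

    Assumed convention: 1 marks a real transient, 0 marks a bogus detection.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # real, accepted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # real, rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # bogus, accepted
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # bogus, rejected
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Toy labels only: 4 real transients and 4 bogus detections.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
predictions = [1, 1, 0, 0, 1, 0, 0, 1]
print(positive_rates(labels, predictions))  # -> (0.75, 0.25)
```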
Note:

1) Recommended by Associate Editor HU Qing-Hua

Fig. 1 An example of the difference-imaging procedure

Fig. 2 Profile measurements by three different methods

Fig. 3 Flowchart of the construction of simulated transient samples

Fig. 4 Flowchart of the transient candidate search

Fig. 5 Effectiveness of the new features

Fig. 6 An example of a real transient: a flare star

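Table 1 Feature sets

Group	No.	Feature	Description	Weight	Rank	Source
I	1	flux_radius2	Aperture radius containing 20% of the object's flux (pixels)	0.1391	1	New in this paper
I	2	flux_radius1	Aperture radius containing 10% of the object's flux (pixels)	0.0548	6	New in this paper
I	3	flux_aper	Flux within a fixed aperture ($r$ = 2.5 pixels)	0.0287	13	New in this paper
I	4	ISO 0	Area of isophotal region 0 (square pixels)	0.0559	5	New in this paper
I	5	ISO 1	Area of isophotal region 1 (square pixels)	0.0308	11	New in this paper
I	6	ISO 2	Area of isophotal region 2 (square pixels)	0.0145	20	New in this paper
I	7	ISO 3	Area of isophotal region 3 (square pixels)	0.0145	19	New in this paper
I	8	ISO 4	Area of isophotal region 4 (square pixels)	0.0099	23	New in this paper
I	9	r_max_aper	Ratio of the peak pixel flux to the fixed-aperture flux	0.1056	3	New in this paper
I	10	r_aper_ISO	Ratio of the fixed-aperture flux to the isophotal flux	0.0295	12	New in this paper
I	11	r_aper_ISOCOR	Ratio of the fixed-aperture flux to the corrected isophotal flux	0.0213	16	New in this paper
I	12	mag_err_aper	RMS error of the magnitude	0.0349	10	New in this paper
I	13	class_star	Star/galaxy classification index (range: 0 to 1)	0.0072	25	New in this paper
II	14	diffsum	Sum of all elements of the 5×5 matrix centered on the object in $R(d)$	0.0512	7	Ref. [8]
II	15	colmeds	Maximum of the column medians of $B(d)$	0.0152	18	Ref. [8]
II	16	numneg	Number of negative elements in the 7×7 matrix centered on the object in $R(d)$	0.0086	24	Ref. [8]
II	17	a_image	RMS along the major axis, from SExtractor	0.0138	21	Ref. [8]
II	18	b_image	RMS along the minor axis, from SExtractor	0.1152	2	Ref. [8]
II	19	ellipticity	1-b_image/a_image, from SExtractor	0.0908	4	Ref. [8]
II	20	flags	SExtractor extraction flags on $I(d)$, from SExtractor	0.0404	8	Ref. [8]
II	21	mag_aper	Fixed-aperture magnitude, from SExtractor	0.0361	9	Ref. [8]
II	22	n2sig3	Number of elements < -2 in the 5×5 matrix centered on the object in $R(d)$	0.0273	14	Ref. [8]
II	23	n3sig3	Number of elements < -3 in the 5×5 matrix centered on the object in $R(d)$	0.0228	15	Ref. [8]
II	24	n3sig5	Number of elements < -3 in the 7×7 matrix centered on the object in $R(d)$	0.0188	17	Ref. [8]
II	25	n2sig5	Number of elements < -2 in the 7×7 matrix centered on the object in $R(d)$	0.0135	22	Ref. [8]
III	-	r_aper_psf	(flux_aper+flux_psf)/flux_psf	0.148	-	Ref. [8]
III	-	flux_ratio	Ratio of the flux in the 5 pixels centered on the object in $I(d)$ to the absolute value of the flux in the 5 pixels centered on the object in $I(t)$	0.037	-	Ref. [8]
III	-	n3sig3shift	Number of elements ≥ 3 in the 5×5 matrix centered on the object in $R(d)$ minus the number of elements ≥ 3 in the 5×5 matrix centered on the object in $R(t)$	0.019	-	Ref. [8]
III	-	n3sig5shift	Number of elements ≥ 3 in the 7×7 matrix centered on the object in $R(d)$ minus the number of elements ≥ 3 in the 7×7 matrix centered on the object in $R(t)$	0.018	-	Ref. [8]
III	-	n2sig3shift	Number of elements ≥ 2 in the 5×5 matrix centered on the object in $R(d)$ minus the number of elements ≥ 2 in the 5×5 matrix centered on the object in $R(t)$	0.014	-	Ref. [8]
III	-	n2sig5shift	Number of elements ≥ 2 in the 7×7 matrix centered on the object in $R(d)$ minus the number of elements ≥ 2 in the 7×7 matrix centered on the object in $R(t)$	0.012	-	Ref. [8]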
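
Several Group II features in Table 1 are simple counts or sums over a small window of the matrix $R(d)$ centered on the object. The sketch below shows how they could be computed with NumPy, following the window sizes and thresholds given in the table. It rests on assumptions that this page does not state: `r_d` is taken to be an odd-sized 2-D stamp (at least 7×7) of $R(d)$ with the object on the central pixel, and the helper names `central_window`, `stamp_features`, and `ellipticity` are hypothetical.

```python
import numpy as np


def central_window(stamp, size):
    """Return the size x size window centered on the stamp's central pixel."""
    cy, cx = stamp.shape[0] // 2, stamp.shape[1] // 2
    half = size // 2
    return stamp[cy - half:cy + half + 1, cx - half:cx + half + 1]


def stamp_features(r_d):
    """Compute window-based Group II features from a stamp of R(d).

    Assumption: r_d is an odd-sized 2-D array, at least 7x7, centered
    on the object.
    """
    r_d = np.asarray(r_d, dtype=float)
    w5 = central_window(r_d, 5)  # 5x5 window
    w7 = central_window(r_d, 7)  # 7x7 window
    return {
        "diffsum": float(w5.sum()),      # sum over the 5x5 window
        "numneg": int((w7 < 0).sum()),   # negative elements in 7x7
        "n2sig3": int((w5 < -2).sum()),  # elements < -2 in 5x5
        "n3sig3": int((w5 < -3).sum()),  # elements < -3 in 5x5
        "n2sig5": int((w7 < -2).sum()),  # elements < -2 in 7x7
        "n3sig5": int((w7 < -3).sum()),  # elements < -3 in 7x7
    }


def ellipticity(a_image, b_image):
    """Ellipticity as defined in Table 1: 1 - b_image / a_image."""
    return 1.0 - b_image / a_image
```

In the table, `a_image` and `b_image` come from SExtractor, so `ellipticity` here only restates the formula and does not measure the axes itself.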

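Table 2 The main parameters of random forest

Hyperparameter	Value	Description
n_estimators	100	Number of trees in the random forest
criterion	entropy	Function used to decide whether a node in a tree is split
n_jobs	-1	Number of trees trained in parallel; -1 sets it equal to the number of CPU cores
max_features	5	Maximum number of features drawn at random, without replacement, when training a node
min_samples_split	3	Minimum number of samples required to split a node
max_depth	unlimited	Maximum depth of the trees in the random forest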
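
The hyperparameter names in Table 2 coincide with those of scikit-learn's `RandomForestClassifier`. This page does not name the library that was used, so the sketch below assumes scikit-learn and shows how the listed values would be passed to it. The training data are random numbers standing in for the 25 features of Table 1, not the paper's samples. Treating the "Weight" column of Table 1 as the forest's feature importances is also an assumption, as are the helper name `build_classifier` and the fixed `random_state`.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def build_classifier(random_state=0):
    """Random forest configured with the values listed in Table 2."""
    return RandomForestClassifier(
        n_estimators=100,      # number of trees
        criterion="entropy",   # split criterion
        n_jobs=-1,             # use all CPU cores
        max_features=5,        # features considered at each split
        min_samples_split=3,   # minimum samples needed to split a node
        max_depth=None,        # "unlimited" depth in Table 2
        random_state=random_state,  # assumption: fixed only for repeatability
    )


# Synthetic stand-in data: 400 objects, 25 features (as in Groups I and II).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 25))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy labels, 1 = "real"

clf = build_classifier()
clf.fit(X[:300], y[:300])

# Score in [0, 1] for each held-out object being a real transient.
scores = clf.predict_proba(X[300:])[:, 1]

# Per-feature importances, sorted from most to least important.
importances = clf.feature_importances_
ranking = np.argsort(importances)[::-1]
```

A threshold on `scores` then trades the true positive rate against the false positive rate; the threshold used in the paper is not given on this page.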

References

[1]	Perlmutter S, Aldering G, Goldhaber G, Knop R A, Nugent P, Castro P G, Deustua S, Fabbro S, Goobar A, Groom D E. Measurements of Ω and Λ from 42 high-redshift supernovae. The Astrophysical Journal, 1999, 517(2):565-586 https://www.physics.rutgers.edu/grad/690/Mar13-Hovey.pdf
[2]	Riess A G, Filippenko A V, Challis P, Clocchiatti A, Diercks A, Garnavich P M, Gilliland R L, Hogan C J, Jha S, Kirshner R P, Leibundgut B, Phillips M M, Reiss D, Schmidt B P, Schommer R A, Smith R C, Spyromilio J, Stubbs C, Suntzeff N B, Tonry J. Observational evidence from supernovae for an accelerating universe and a cosmological constant. The Astronomical Journal, 1998, 116(3):1009-1038 doi: 10.1086/300499
[3]	Wu Chao, Zhang Tian-Meng, Wang Xiao-Feng, Qiu Yu-Lei. Supernova cosmology: observations and progress. Progress in Astronomy, 2013, 31(1):37-55 (in Chinese) http://www.doc88.com/p-9179977318136.html
[4]	Bailey S, Aragon C, Romano R, Thomas R C, Weaver B A, Wong D. How to find more supernovae with less work: object classification techniques for difference imaging. The Astrophysical Journal, 2007, 665(2):1246-1253 https://arxiv.org/abs/0705.0493
[5]	Brink H, Richards J W, Poznanski D, Bloom J S, Rice J, Negahban S, Wainwright M. Using machine learning for discovery in synoptic survey imaging data. Monthly Notices of the Royal Astronomical Society, 2013, 435(2):1047-1060 doi: 10.1093/mnras/stt1306
[6]	Bloom J S, Richards J W, Nugent P E, Quimby R M, Kasliwal M M, Starr D L, Poznanski D, Ofek E O, Cenko S B, Butler N R, Kulkarni S R, Gal-Yam A, Law N. Automating discovery and classification of transients and variable stars in the synoptic survey era. Publications of the Astronomical Society of the Pacific, 2012, 124(921):1175-1196 doi: 10.1086/668468
[7]	du Buisson L, Sivanandam N, Bassett B A, Smith M. Machine learning classification of SDSS transient survey images. Monthly Notices of the Royal Astronomical Society, 2015, 454(2):2026-2038 doi: 10.1093/mnras/stv2041
[8]	Goldstein D A, D'Andrea C B, Fischer J A, Foley R J, Gupta R R, Kessler R, Kim A G, Nichol R C, Nugent P E, Papadopoulos A, Sako M, Smith M, Sullivan M, Thomas R C, Wester W, Wolf R C, Abdalla F B, Banerji M, Benoit-Lévy A, Bertin E, Brooks D, Rosell A C, Castander F J, Costa L N D, Covarrubias R, DePoy D L, Desai S, Diehl H T, Doel P, Eifler T F, Neto A F, Finley D A, Flaugher B, Fosalba P, Frieman J, Gerdes D, Gruen D, Gruendl R A, James D, Kuehn K, Kuropatkin N, Lahav O, Li T S, Maia M A G, Makler M, March M, Marshall J L, Martini P, Merritt K W, Miquel R, Nord B, Ogando R, Plazas A A, Romer A K, Roodman A, Sanchez E, Scarpine V, Schubnell M, Sevilla-Noarbe I, Smith R C, Soares-Santos M, Sobreira F, Suchyta E, Swanson M E C, Tarle G, Thaler J, Walker A R. Automated transient identification in the dark energy survey. The Astronomical Journal, 2015, 150(3):Article No. 82 http://www.oalib.com/paper/3558300
[9]	Bertin E, Arnouts S. SExtractor: software for source extraction. Astronomy and Astrophysics Supplement Series, 1996, 117:393-404 doi: 10.1051/aas:1996164
[10]	Breiman L. Random forests. Machine Learning, 2001, 45(1):5-32
[11]	Fang Kuang-Nan, Wu Jian-Bin, Zhu Jian-Ping, Xie Bang-Chang. A review of technologies on random forests. Statistics and Information Forum, 2011, 26(3):32-38 (in Chinese) http://dspace.xmu.edu.cn/handle/2288/112057?show=full
[12]	Huang Yan, Zha Wei-Xiong. Comparison on classification performance between random forests and support vector machine. Software, 2012, 33(6):107-110 (in Chinese) http://www.docin.com/p-497267267.html

