Name: Guangyan Zhang (张广艳)
Title: Associate Professor
Phone: 62783505-8004
Email: gyzh@tsinghua.edu.cn
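Education
B.S. in Computer Science, Jilin University, China, 2000;
M.Eng. in Computer Science and Technology, Jilin University, China, 2003;
Ph.D. in Engineering (Computer Science and Technology), Tsinghua University, China, 2008.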
Research Areas
Big data computing, storage systems, distributed processing
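Research Overview
His research focuses on the theory and methods of big data storage and analysis, including data-intensive computing, storage systems, and distributed processing. The work has been supported by more than ten national research projects, including the National Science Fund for Distinguished Young Scholars, the National Key R&D Program of China, and the 973 and 863 Programs. In recent years he has proposed methods and key techniques for building and accessing large-scale storage systems, improving their performance, scalability, and availability. He has published more than 40 academic papers, of which more than 20 appeared in top international conferences and journals in computer systems, including FAST (the premier international conference in the storage field, 4 papers), USENIX ATC, ACM TOS, IEEE TC, and IEEE TPDS. Over the past five years, as first inventor, he has been granted 8 Chinese invention patents and 2 US invention patents. His research results have been transferred into the storage products of several leading domestic companies, with good results. Two of the master's students he supervised have received the Tsinghua University Outstanding Master's Thesis award.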
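Main Research Contributions
(1) Architecture and data layout of storage arrays: proposed an I/O resource sharing architecture and an elastic data layout theory for storage arrays, and designed key methods for on-demand scaling of array size and fast recovery from failures, breaking through the bottlenecks of traditional storage array performance and reconstruction modes. While maintaining high foreground I/O performance, background operations such as scaling and recovery can be completed in a relatively short time, and data consistency is guaranteed when foreground and background operations run concurrently.
(2) Data organization and online recovery in public cloud storage: proposed a theory for efficiently scheduling computation and transmission in redundant storage, and designed fault-tolerance methods for large-scale distributed storage systems, effectively resolving the tension among data reliability, storage cost, and I/O performance. While keeping data in the storage system highly available, the work improves write performance, degraded-read performance, and recovery performance.
(3) Data organization and efficient access in data-intensive computing: advocates a collaborative theory and system architecture that integrates storage and computation, and designed domain-specific compressed data representations and efficient access methods, broadening the approaches available for optimizing the I/O capability of big data systems. The work exploits the behavioral characteristics of big data computation to reduce the volume of data accessed, improves data access locality while preserving computational parallelism, and thereby raises the performance of big data computation through more efficient data access.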
Publications
[1]. Tianyang Jiang, Guangyan Zhang, Zican Huang, Xiaosong Ma, Junyu Wei, Zhiyue Li, Weimin Zheng. FusionRAID: Achieving Consistent Low Latency for Commodity SSD Arrays. in the Proceedings of the 19th USENIX Conference on File and Storage Technologies (FAST'21), Santa Clara, CA, February 2021. Pages 355-370.
[2]. Junyu Wei, Guangyan Zhang, Yang Wang, Zhiwei Liu, Zhanyang Zhu, Junchao Chen, Tingtao Sun, Qi Zhou. On the Feasibility of Parser-based Log Compression in Large-Scale Cloud Systems. in the Proceedings of the 19th USENIX Conference on File and Storage Technologies (FAST'21), Santa Clara, CA, February 2021. Pages 249-262.
[3]. Xiaqing Li, Guangyan Zhang, Weimin Zheng. SmartTuning: Selecting HyperParameters of a ConvNet System for Fast Training and Small Working Memory. IEEE Transactions on Parallel and Distributed Systems, Volume: 32, Issue: 7, Pages: 1690-1701. July 2021.
[4]. Guangyan Zhang, Zhufan Wang, Xiaosong Ma, Songlin Yang, Zican Huang, Weimin Zheng. Determining Data Distribution for Large Disk Enclosures with 3-D Data Templates. ACM Transactions on Storage, Volume 15, Issue 4, Article No. 27, December 2019. Pages 1-38.
[5]. Zhufan Wang, Guangyan Zhang, Yang Wang, Qinglin Yang, Jiaji Zhu. Dayu: Fast and Low-interference Data Recovery in Very-large Storage Systems. in the Proceedings of the 2019 USENIX Annual Technical Conference (ATC'19), Renton, WA, July 2019. Pages 993-1007.
[6]. Chengwen Wu, Guangyan Zhang, Yang Wang, Xinyang Jiang, Weimin Zheng. Redio: Accelerating Disk-based Graph Processing by Reducing Disk I/Os. IEEE Transactions on Computers, Volume: 68, Issue: 3, Page(s): 414 - 425. March 2019.
[7]. Xiaqing Li, Guangyan Zhang, Zhufan Wang, Weimin Zheng. HyConv: Accelerating Multi-phase CNN Computation by Fine-grained Policy Selection. IEEE Transactions on Parallel and Distributed Systems, Volume: 30, Issue: 2, Page(s): 388 - 399. Feb. 2019.
[8]. Guangyan Zhang, Zican Huang, Xiaosong Ma, Songlin Yang, Zhufan Wang, Weimin Zheng. RAID+: Deterministic and Balanced Data Distribution for Large Disk Enclosures. in the Proceedings of the 16th USENIX Conference on File and Storage Technologies (FAST'18), Oakland, CA, February 2018. Pages 279-293.
[9]. Guangyan Zhang, Shuhan Cheng, Jiwu Shu, Qingda Hu, and Weimin Zheng. Accelerating Breadth-First Graph Search on a Single Server by Dynamic Edge Trimming. Journal of Parallel and Distributed Computing (JPDC), Volume 120, Pages 383-394, October 2018.
[10]. Dawei Sun, Guangyan Zhang, Chengwen Wu, Keqin Li, and Weimin Zheng. Building a Fault Tolerant Framework with Deadline Guarantee in Big Data Stream Computing Environments, Journal of Computer and System Sciences, Volume 89, November 2017, Pages 4-23.
[11]. Guangyan Zhang, Guiyong Wu, Yu Lu, Jie Wu, and Weimin Zheng. Xscale: Online X-code RAID-6 Scaling Using Lightweight Data Reorganization. IEEE Transactions on Parallel and Distributed Systems, Volume: 27, Issue: 12, Page(s): 3687 - 3700. Dec. 2016.
[12]. Xiaqing Li, Guangyan Zhang, H. Howie Huang, Zhufan Wang and Weimin Zheng. Performance Analysis of GPU-based Convolutional Neural Networks, in the Proceedings of the 45th International Conference on Parallel Processing (ICPP-2016), Philadelphia, PA USA, August 2016.
[13]. Shuhan Cheng, Guangyan Zhang, Jiwu Shu, Qingda Hu, and Weimin Zheng. FastBFS: Fast Breadth-First Graph Search on a Single Server, in the Proceedings of the 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS'16), Chicago, Illinois USA, May 2016.
[14]. Guangyan Zhang, Guiyong Wu, Shupeng Wang, Jiwu Shu, Weimin Zheng, and Keqin Li. CaCo: An Efficient Cauchy Coding Approach for Cloud Storage Systems. IEEE Transactions on Computers, Volume: 65, Issue: 2, Page(s): 435 - 447. Feb. 2016.
[15]. Dawei Sun, Guangyan Zhang, Songlin Yang, Weimin Zheng, Samee Khan, and Keqin Li. Re-Stream: real-time and energy-efficient resource scheduling in big data stream computing environments, Information Sciences, Volume 319, Pages 92-112, October 2015.
[16]. Guangyan Zhang, Jigang Wang, Keqin Li, Jiwu Shu, and Weimin Zheng. Redistribute Data to Regain Load Balance during RAID-4 Scaling. IEEE Transactions on Parallel and Distributed Systems, Volume: 26, Issue: 1, Page(s): 219 - 229. Jan. 2015.
[17]. Guangyan Zhang, Keqin Li, Jingzhe Wang, Weimin Zheng. Accelerate RDP RAID-6 Scaling by Reducing Disk I/Os and XOR Operations. IEEE Transactions on Computers, Volume: 64, Issue: 1, Page(s): 32 - 44. Jan. 2015.
[18]. Guangyan Zhang, Weimin Zheng, Keqin Li. Rethinking RAID-5 Data Layout for Better Scalability. IEEE Transactions on Computers, Volume: 63, Issue: 11, Page(s): 2816 - 2828. Nov. 2014.
[19]. Zhang, G., Zheng, W., and Li, K. 2013. Design and evaluation of a new approach to RAID-0 scaling. ACM Transactions on Storage, 9, 4, Article 11 (November 2013), 31 pages.
[20]. Weimin Zheng, Guangyan Zhang. FastScale: Accelerate RAID Scaling by Minimizing Data Migration. in the Proceedings of the 9th USENIX Conference on File and Storage Technologies (FAST'11), San Jose, CA, February 2011.
[21]. Yang Wang, Jiwu Shu, Guangyan Zhang, Wei Xue, Weimin Zheng. SOPA: Selecting the Optimal Policy Adaptively. ACM Transactions on Storage, Volume 6, Issue 2, July 2010.
[22]. Guangyan Zhang, Weimin Zheng, and Jiwu Shu. ALV: A New Data Redistribution Approach to RAID-5 Scaling. IEEE Transactions on Computers, Vol. 59, No. 3. pp. 345-357, March 2010.
[23]. Guangyan Zhang, Jiwu Shu, Wei Xue, and Weimin Zheng. Design and Implementation of an Out-of-Band Virtualization System for Large SANs. IEEE Transactions on Computers, Vol. 56, No. 12. pp. 1654-1665, Dec 2007.
[24]. Zhang, G., Shu, J., Xue, W., and Zheng, W. 2007. SLAS: An efficient approach to scaling round-robin striped volumes. ACM Transactions on Storage, Volume 3 Issue 1, March 2007, pp. 1-39.