SCIE, EI, Scopus, INSPEC, DBLP, CSCD, etc.
Citation: | Yong Dou, Jie Zhou, Gui-Ming Wu, Jing-Fei Jiang, Yuan-Wu Lei, Shi-Ce Ni. A Unified Co-Processor Architecture for Matrix Decomposition[J]. Journal of Computer Science and Technology, 2010, 25(4): 874-885. DOI: 10.1007/s11390-010-1068-5 |
[1] |
Farina A, Timmoneri L. Parallel algorithms and processing architectures for space-time adaptive processing. In Proc. Radar CIE International Conference, Beijing, China, October 8-10, 1996, pp.770-774.
|
[2] |
Rabideau D J, Kogon S M. A signal processing architecture for space-based GMTI radar. In Proc. the Record of the IEEE Radar Conference, Waltham, Massachusetts, April 20-22, 1999, pp.96-101.
|
[3] |
Fischer B, Modersitzki J. Fast inversion of matrices arising in image processing. Computer Science, 1999, 22(1): 1-11.
|
[4] |
Batchelor G H. Introduction to Fluid Dynamics. 2nd Edition, Cambridge University Press, 2000.
|
[5] |
Ojalvo I U. Proper use of Lanczos vectors for large eigenvalue problems. Computers & Structures, 1985, 20(1-3): 115-120.
|
[6] |
Buttari A, Langou J, Kurzak J, Dongarra J. Parallel tiled QR factorization for multicore architecture. Concurrency and Computation: Practice and Experience, 2008, 20(13): 1573-1590.
|
[7] |
The LINPACK Benchmark. http://www.netlib. org/linpack/, December, 2008.
|
[8] |
Xu H, Alexander W E. Parallel QR factorization on a block data flow architecture. In Proc. the 24th Southeastern Symposium and the 3rd Annual Symposium on Communications, Signal Processing Expert Systems, and ASIC VLSI Design, March 1-3, 1992, pp.332-336.
|
[9] |
Fernandez L, Garcia J M. The performance of fast Givens Rotation problem implemented with MPI extensions in multicomputer. In Proc. International Conference on Applications of High-Performance Computers in Engineering, Santiago de Compostela, Espagne, July 1997, pp.83-92.
|
[10] |
Ian N Dunn, Gerard G L Meyer. Parallel QR factorization for hybrid message passing/shared memory operation. Journal of the Franklin Institute, 338(5): 601-613.
|
[11] |
Hernandez V, Roman J E, Tomas A. A parallel variant of the gram-Schmidt process with reorthogonalization. In Proc. International Conference on Parallel Computing: Current \& Future Issues of High-End Computing, Malaga, Spain, Sept. 13-16, 2005, pp.221-228.
|
[12] |
Bischof C H. A parallel QR factorization algorithm using local pivoting. SIAM Journal on Scientific and Statistical Computing, 1991, 12(1): 36-57.
|
[13] |
Peng S, Sedukhin S, Sedukhin I. Householder bidiagonalization on parallel computers with dynamic ring architecture. In Proc. the 2nd Aizu International Symposium on Parallel Algorithms/Architecture Synthesis, Aizu-Wakamatsu, Japan, March 17-21, 1997, pp.182-191.
|
[14] |
Rüger G, Schwind M. Comparison of different parallel modified Gram-Schmidt algorithms. In Proc. Euro-Par Parallel Processing, Lisbon, Portugal, Aug. 30-September 2, 2005, pp.826-836.
|
[15] |
Buttari A, Langou J, Kurzak J, Dongarra J. Parallel tiled QR factorization for multicore architectures. Concurrency and Computation: Practice and Experience, 2008, 20(13): 1573-1590.
|
[16] |
Oliveria S, Soma T. New partitioning schemes for parallel modified Gram-Schmidt orthogonalization. In Proc. the 3rd International Symposium on Parallel Architectures, Algorithms and Networks, Las Vegas, Nevada, USA, Dec. 7-9, 1997, pp.233-239.
|
[17] |
Wilburn C Wilburn, Hak-Lim Ko, Winser E Alexander. An algorithm and architecture for the parallel solution of systems of linear equations. In Proc. IEEE Fifteenth Annual International Phoenix Conference on Computers and Communications, Arizona, USA, 1996, pp.392-398.
|
[18] |
Singh C K, Prasad S H, Balsara P T. VLSI architecture for matrix inversion using modified Gram-Schmidt based QR decomposition. In Proc. the 20th International Conference on VLSI Design, Bangalore, India, Jan. 6-10, 2007, pp.836-841.
|
[19] |
Lorenzelli F, Yao K. A linear systolic array for recursive least squares. IEEE Transactions on Signal Processing, 1995, 43(12): 3014-3021.
|
[20] |
Liu K J R, Heieh S F , Yao K. Recursive LS filtering using block Householder transformation. In Proc. IEEE ICASSP, Albuquerque, USA, April 3-6, 1990, pp.1631-1634.
|
[21] |
Tang C F T, Liu K J R, Tretter S A. On systolic arrays for recursive complex Householder transformations with applications to array processing. In Proc. International Conference on Acoustics, Speech, and Signal Processing, Toronto, Canada, May 14-17, 1991, pp.1033-1036.
|
[22] |
Sergyienko A, Maslennikov O. Implementation of givens QR-decomposition in FPGA. In Proc. Int. Conf. Parallel Processing and Applied Mathematics, Na Leczów, Porland, Sept. 9-12, 2000, pp.458-465.
|
[23] |
Yokoyama Y, Kim M, Arai H. Implementation of Systolic RLS adaptive array using FPGA and its performance evaluation. In Proc. the 64th Vehicular Technology Conference, Montreal, Canada, Sept. 25-28, 2006, pp.1-5.
|
[24] |
Karkooti M, Cavallaro J R, Dick C. FPGA implementation of matrix inversion using QRD-RLS algorithm. In Conference Record of the 39th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, Oct. 30-Nov. 2, 2005, pp.1625-1629.
|
[25] |
Echman F, Owall V. A scalable pipelined complex valued matrix inversion architecture. In Proc. IEEE International Symposium on Circuits and Systems, Kobe, Japan, May 23-26, 2005, pp.4489-4492.
|
[26] |
Kim D, Rajopadhye S. An improved systolic architecture for LU decomposition. In Proc. ASAP 2006, Steamboat Springs, USA, Sept. 11-13, 2006, pp.231-238.
|
[27] |
Choi S, Prasanna V. Time and energy efficient matrix factorization using FPGAs. In Proc. FPL 2003, Lisbon, Portugal, Sept. 1-3, 2003, pp.507-519.
|
[28] |
Yao K, Lorenzelli F. Systolic algorithm and architecture for high-throughput processing applications. Journal of VLSI Signal Processing, 2007, 53(1/2): 15-34.
|
[1] | Yi-Fan Zhang, Lei Sun, Qiang Cao. TLP-LDPC: Three-Level Parallel FPGA Architecture for Fast Prototyping of LDPC Decoder Using High-Level Synthesis[J]. Journal of Computer Science and Technology, 2022, 37(6): 1290-1306. DOI: 10.1007/s11390-022-1499-9 |
[2] | Shu-Quan Wang, Lei Wang, Yu Deng, Zhi-Jie Yang, Sha-Sha Guo, Zi-Yang Kang, Yu-Feng Guo, Wei-Xia Xu. SIES: A Novel Implementation of Spiking Convolutional Neural Network Inference Engine on Field-Programmable Gate Array[J]. Journal of Computer Science and Technology, 2020, 35(2): 475-489. DOI: 10.1007/s11390-020-9686-z |
[3] | Ji-Liang Zhang, Wei-Zheng Wang, Xing-Wei Wang, Zhi-Hua Xia. Enhancing Security of FPGA-Based Embedded Systems with Combinational Logic Binding[J]. Journal of Computer Science and Technology, 2017, 32(2): 329-339. DOI: 10.1007/s11390-017-1700-8 |
[4] | Ji-Liang Zhang, Qiang Wu, Yi-Peng Ding, Yong-Qiang Lv, Qiang Zhou, Zhi-Hua Xia, Xing-Ming Sun, Xing-Wei Wang. Techniques for Design and Implementation of an FPGA-Specific Physical Unclonable Function[J]. Journal of Computer Science and Technology, 2016, 31(1): 124-136. DOI: 10.1007/s11390-016-1616-8 |
[5] | Cinzia Bernardeschi, Luca Cassano, Andrea Domenici. SRAM-Based FPGA Systems for Safety-Critical Applications: A Survey on Design Standards and Proposed Methodologies[J]. Journal of Computer Science and Technology, 2015, 30(2): 373-390. DOI: 10.1007/s11390-015-1530-5 |
[6] | Ji-Liang Zhang, Gang Qu, Yong-Qiang Lv, Qiang Zhou. A Survey on Silicon PUFs and Recent Advances in Ring Oscillator PUFs[J]. Journal of Computer Science and Technology, 2014, 29(4): 664-678. DOI: 10.1007/s11390-014-1458-1 |
[7] | Xiaofang (Maggie) Wang, Swetha Thota. A Resource-Efficient Communication Architecture for Chip Multiprocessors on FPGAs[J]. Journal of Computer Science and Technology, 2011, 26(3): 434-447. DOI: 10.1007/s11390-011-1145-4 |
[8] | Hui Dai, Qiang Zhou, Ji-Nian Bian. Multilevel Optimization for Large-Scale Hierarchical FPGA Placement[J]. Journal of Computer Science and Technology, 2010, 25(5): 1083-1091. DOI: 10.1007/s11390-010-1085-4 |
[9] | Ehsan Atoofian, Zainalabedin Navabi. A Test Approach for Look-Up Table Based FPGAs[J]. Journal of Computer Science and Technology, 2006, 21(1): 141-146. |
[10] | Zheng Fangqing. The Decomposition of Belief Function[J]. Journal of Computer Science and Technology, 1992, 7(2): 189-192. |