Technology Topics

This page describes the latest technologies being researched and developed at KIOXIA Corporation and various use cases of flash memories.

Research & Development Field

New memory development

New Memory Cell Technology for Terabit-Scale High-Density Application

Development of BiCS FLASH™

3D Semicircular Flash Memory Cell (Twin BiCS FLASH): Novel Split-Gate Technology to Boost Bit Density

TCAD (Technology CAD) development

New Evaluation Method for Nanomaterials

Process technology

Next-generation lithography process: Nanoimprint

Analytical technologies for next-generation devices

Image processing technology utilizing machine learning

RIE Technology Supporting BiCS FLASH™

Development of Monocrystalline Si Channel Process for 3D Flash Memory

14nm Half-pitch Direct Patterning with Nanoimprint Lithography

System technology

HMB (Host Memory Buffer) technology for DRAM-less SSD

Development of high-speed and high-energy-efficiency algorithm and hardware architecture for deep learning accelerator

A 25.6Gb/s Interface with Ring Topology for High-Bandwidth and Large-Capacity Storage Systems

Production management technology

Factory Innovation

Device technology

New Memory Development

We are developing new memory technologies to widen our product portfolio and expand our business. We propose new memory cell technologies to realize file memories with even higher bit density, as well as various high-speed nonvolatile memories. For example, we have demonstrated STT-MRAM technology(*1) and ReRAM technology(*2) with the highest density as of the time of publication(*3). Achieving memories with new structures and new materials requires advanced device, process, and circuit technologies, and we take on these new challenges every day.
 

*1 Spin Transfer Torque Random Access Memory
(We presented 4Gbit STT-MRAM technology at IEDM with SK hynix in 2016.)

*2 Resistive Random Access Memory
(We presented 32Gbit ReRAM technology at ISSCC with SanDisk in 2013.)

*3 Figures according to our research.

Memory cell structures presented at the conference (Left: STT-MRAM; right: ReRAM)

New Memory Cell Technology for Terabit-Scale High-Density Application

New memory cells are being actively studied around the world as candidates for next-generation non-volatile memory with high density and fast access speed. However, the capacity of the memories developed so far is limited to hundreds of gigabits. To overcome this limitation, we have evaluated the issues in, and solutions for, achieving terabit-scale ultra-high-density cross-point memory.

The challenge in achieving terabit-scale cross-point memory is to reduce the operation current of a memory cell. Recent research has pointed out that a memory cell with an operation current below mA is required to solve the issues of large current consumption and voltage drop [1].

As a solution, we focused on a new non-volatile memory: Ag ionic memory. The ionic memory has a simple structure consisting of an active metal and an insulator (Fig.1). The memory state is controlled by the generation and annihilation of a conductive filament in the insulator. By selecting an optimum combination of active metal and insulator, the filament can be given a discontinuous, clustered structure, which reduces the operation current below mA.

Fig.1: Cross-sectional TEM image and switching mechanism for the Ag ionic memory cell.

We successfully fabricated a scaled cross-point array composed of the ionic memory cell (Fig.2) and demonstrated memory properties suitable for terabit-scale high-density applications (Fig.3).

Fig.2: Cross-sectional TEM image for the scaled cross-point array composed of the Ag ionic memory cell.

Fig.3: Memory properties for the Ag ionic memory cells with various cell areas.

This achievement was presented at the 2019 IEEE Symposium on VLSI Technology [2].

[1] Z. Jiang, S. Qin, S. Fujii, D. Lee, S. Wong, and H.-S. P. Wong, “Selector requirements for Tera-bit Ultra-High-Density 3D Vertical RRAM”, 2018 IEEE Symposium on VLSI Technology, pp.107-108.

[2] S. Fujii, R. Ichihara, T. Konno, M. Yamaguchi, H. Seki, H. Tanaka, D. Zhao, Y. Yoshimura, M. Saitoh, and M. Koyama, “Ag Ionic Memory Cell Technology for Terabit-Scale High-Density Application”, 2019 IEEE Symposium on VLSI Technology, pp.T188-T189.

Development of BiCS FLASH™

Flash memories are used to store data in a wide range of information devices and across the IT industry, including smartphones, game consoles, car navigation systems, and cloud servers. To meet the demand for ever-smaller, higher-capacity storage devices, it is essential to increase the storage density of flash memories. For two-dimensional (2D) NAND flash memories, we employed nanofabrication and other technologies to develop memory cells as small as 15 nm. However, geometry scaling is approaching its physical limit. BiCS FLASH™ overcomes this density limit by stacking cell arrays in multiple layers. The latest 96-layer BiCS FLASH™ provides a capacity of 512 gigabits with a chip width of roughly 12 mm, much smaller than a one-cent coin, and an approximately 50% higher bit density than the preceding 64-layer BiCS FLASH™. We are committed to developing technology that further increases the number of stacked layers in order to meet the rapidly growing demand for memory capacity driven by the information explosion.

Electron microscopy image of Gen-4 BiCS FLASH™

Roadmap of NAND FLASH

3D Semicircular Flash Memory Cell (Twin BiCS FLASH): Novel Split-Gate Technology to Boost Bit Density

Three-dimensional (3D) semicircular split-gate flash memory cells have been successfully developed for the first time. Properly designed semicircular Floating Gate (FG) cells achieve a superior program slope and program/erase window at a much smaller cell size than circular Charge Trap (CT) cells. The semicircular split-gate FG cell is projected to be a promising candidate for realizing four or more bits/cell (QLC and beyond), giving significantly higher memory density with a lower number of stacked layers.

3D flash memory technology has realized high bit density at low cost per bit by increasing the number of cell stacks and implementing multilayer stack deposition and high-aspect-ratio etching [1]. In recent years, as the number of cell layers has exceeded 100 [2], managing the fundamental trade-offs among etch profile control, size uniformity, and productivity has become increasingly challenging. This work demonstrates a 3D semicircular split-gate cell technology that attains a considerable reduction in cell size with significant gains in the program slope and program/erase window compared to the conventional circular cell, enabling higher-density memories with a lower number of cell layers (Fig. 1).

Fig. 1: (a) Bit density as a function of the number of cell layers. (b) Schematic plane view of the semicircular cells.

The circle-shaped control gate provides a larger program window with relaxed saturation problems compared with a flat-shaped gate because of the curvature effect [3], in which carrier injection through the tunnel dielectric is enhanced while electron leakage to the block (BLK) dielectric is lowered (Fig. 2). In this split-gate cell design, the circular control gate is symmetrically divided into two semicircular gates to take advantage of the strong improvement in the program/erase dynamics. As shown in Fig. 3, a conductive storage layer is employed for high charge trapping efficiency in conjunction with high-k BLK dielectrics, achieving a high coupling ratio to gain program window as well as reduced electron leakage from the FG, thus relieving the saturation issue. The FG is completely isolated for each cell, which eliminates charge migration across adjacent cells.
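As a rough illustration of the curvature effect, the sketch below uses an idealized coaxial-capacitor model with a single uniform dielectric. This model, the function names, and all numeric values are assumptions made for illustration; they are not the simulation behind Fig. 2. In a coaxial geometry the field scales as 1/r, so it is stronger on the inner (tunnel) side and weaker on the outer (block) side than in a flat stack of the same thickness.

```python
import math


def coaxial_field(voltage, r_inner, r_outer, r):
    """Field magnitude (V/m) at radius r in an ideal coaxial capacitor."""
    if not (0 < r_inner < r_outer and r_inner <= r <= r_outer):
        raise ValueError("require 0 < r_inner < r_outer and r_inner <= r <= r_outer")
    return voltage / (r * math.log(r_outer / r_inner))


def planar_field(voltage, thickness):
    """Uniform field (V/m) across a flat dielectric of the same total thickness."""
    return voltage / thickness


# Hypothetical values chosen only for illustration.
V = 10.0          # voltage across the dielectric stack, volts
R_INNER = 5e-9    # channel-side (tunnel dielectric) radius, metres
R_OUTER = 20e-9   # gate-side (block dielectric) radius, metres

e_tunnel = coaxial_field(V, R_INNER, R_OUTER, R_INNER)
e_block = coaxial_field(V, R_INNER, R_OUTER, R_OUTER)
e_flat = planar_field(V, R_OUTER - R_INNER)

print(f"tunnel side (curved): {e_tunnel:.2e} V/m")
print(f"block side  (curved): {e_block:.2e} V/m")
print(f"flat stack          : {e_flat:.2e} V/m")
```

In this simplified picture the ratio between the tunnel-side and block-side fields equals the ratio of the two radii, which is why a smaller inner radius strengthens the effect.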

Fig. 2: Simulated results of saturation Vt versus program window.

Fig. 3: Fabricated semicircular FG cells. (a) Cross-sectional view. (b) Plane view.

The experimental program/erase characteristics in Fig. 4 reveal that the semicircular FG cells with the high-k-based BLK exhibit a higher program slope and a larger program/erase window than the larger-sized circular CT cells. Furthermore, careful tailoring of the tunnel layer (TNL) film quality and interface formation processes sufficiently improves endurance and post-cycling data retention up to 10k cycling stress (Fig. 5). The semicircular FG cells, having superior program/erase characteristics, are expected to attain comparably tight QLC Vt distributions at a small cell size, and with integration of a low-trap Si channel they are capable of more than four bits/cell, e.g., Penta-Level Cell (PLC), as projected in Fig. 6. These results confirm that the semicircular FG cell is a viable candidate for advancing the quest to boost bit density.
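To see why a wider program/erase window matters for QLC and PLC, the short sketch below counts the Vt states each cell type must distinguish. Storing n bits needs 2^n states; the even-spacing calculation is a simplifying assumption for illustration only, and the function names are hypothetical.

```python
def vt_levels(bits_per_cell):
    """Number of distinct Vt states needed to store the given bits per cell."""
    if bits_per_cell < 1:
        raise ValueError("bits_per_cell must be at least 1")
    return 2 ** bits_per_cell


def relative_spacing(bits_per_cell, window=1.0):
    """Spacing between adjacent states if they share one Vt window evenly."""
    return window / (vt_levels(bits_per_cell) - 1)


for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4), ("PLC", 5)]:
    print(f"{name}: {vt_levels(bits):2d} states, "
          f"spacing = {relative_spacing(bits):.3f} of the window")
```

Going from QLC (16 states) to PLC (32 states) roughly halves the room available per state, so a larger window and tighter distributions are both needed.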

Fig. 4: Experimental program/erase characteristics comparing the semicircular FG cells with the circular CT cells.

Fig. 5: Reliability for 10k P/E cycling stress. (a) Endurance. (b) Data retention.

Fig. 6: Simulated Vt distributions after programming using calibrated parameters. (a) QLC. (b) PLC.

The novel split-gate structure, in which the control gate of the conventional circular cell is divided into semicircular shapes, achieves a large number of bits/cell at a significantly reduced cell size, and therefore provides high bit density with a low number of cell stacks. We will continue to refine the cell design and characteristics and pursue research and development toward practical applications.

[1] H. Tanaka, et al., “Bit cost scalable technology with punch and plug process for ultra high density flash memory”, Symp. VLSI Tech. Dig., p. 14, 2007.

[2] C. Siau, et al., “A 512Gb 3-bit/cell 3D flash memory on 128-wordline-layer with 132MB/s write performance featuring circuit-under-array technology”, ISSCC Tech. Dig., p. 218, 2019.

[3] S. Amoroso, et al., “Semi-analytical model for the transient operation of gate-all-around charge-trap memories”, IEEE Trans. Electron Devices, p. 3116, 2011.

TCAD (Technology CAD) Development

We are developing TCAD for memory devices that require new materials and complex 3D structures.

To start, we establish fundamental models of process phenomena and device operations. We apply computational science, such as first-principles calculations, for a thorough understanding of microscopic phenomena at the electron or atomic level.

Then, we promptly build the process and device models into our in-house TCAD system, which provides robust simulation.

We contribute to efficient development of advanced memories not only by finding solutions to technical issues in the memories currently under development, but also by predicting the performance and possible issues of future-generation memories before fabrication starts.

Development flow with TCAD

New Evaluation Method for Nanomaterials

To realize new memory devices, the development of nanomaterials (molecules or particles smaller than 10 nm) is crucially important, but evaluating their electrical properties is extremely difficult.

For example, when the top electrode material is deposited on the nanomaterial on the bottom electrode, degradation of the nanomaterial may occur if the heat resistance of the nanomaterial is low, or a short between the top electrode and the bottom electrode may occur if the top electrode material penetrates the nanomaterial. Probing by STM (Scanning Tunneling Microscope) is another evaluation method, but it is very difficult to get good reproducibility.

We have established a brand-new evaluation method for nanomaterials by applying a state-of-the-art semiconductor fabrication process. First, a large number of nanogaps like the one in Fig. 1, whose spacing is almost the same as the size of the nanomaterial, are formed at once with good controllability, and then a nanomaterial is inserted into each nanogap. Figure 2 shows examples of nanomaterials, namely a gold nanoparticle, fullerene C60, and an oligo-phenylene-ethylene derivative. Figure 3 shows the I-V characteristics of nanomaterials in a 5 nm or 2 nm gap. Very small currents, lower than 1 pA (p = 10^-12), can be measured successfully. Figure 4 shows a histogram of the threshold voltage at which a current of 0.1 pA flows; such distributions can be obtained by multi-point measurement.
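The threshold-voltage histogram described above can be sketched as follows. This is a minimal illustration under assumptions: the data layout (one voltage list and one current list per nanogap), the linear interpolation, the bin width, and the function names are all hypothetical and are not taken from the original measurement procedure.

```python
import math
from collections import Counter


def threshold_voltage(voltages, currents, i_th=0.1e-12):
    """First voltage at which |I| reaches i_th (linear interpolation), or None."""
    prev_v = prev_i = None
    for v, i in zip(voltages, currents):
        mag = abs(i)
        if mag >= i_th:
            if prev_v is None:
                return v  # already above threshold at the first sample
            frac = (i_th - prev_i) / (mag - prev_i)
            return prev_v + frac * (v - prev_v)
        prev_v, prev_i = v, mag
    return None  # threshold current never reached in this sweep


def histogram(values, bin_width=0.1):
    """Count values per bin; each bin is labelled by its lower edge."""
    counts = Counter()
    for x in values:
        index = math.floor(x / bin_width + 1e-9)  # small guard against float error
        counts[round(index * bin_width, 10)] += 1
    return dict(sorted(counts.items()))


# Hypothetical sweep: voltage in volts, current in amperes.
sweep_v = [0.0, 0.5, 1.0, 1.5]
sweep_i = [0.0, 0.02e-12, 0.06e-12, 0.14e-12]
vth = threshold_voltage(sweep_v, sweep_i)
print(f"threshold voltage: {vth:.2f} V")

# Hypothetical thresholds collected from several nanogaps.
print(histogram([1.25, 1.21, 0.83]))
```

Sweeps that never reach 0.1 pA return `None`, so they can be counted separately instead of distorting the distribution.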

We will continue to develop new evaluation methods and apply them in the development of new nanomaterials, and promote the development of new functional devices.

Development of New Evaluation Method for Nanomaterials

Process technology

Next-Generation Lithography Process: Nanoimprint

In optical lithography, shorter wavelengths and higher numerical apertures (NAs), which increase the lens diameter, have been introduced to meet the demand for device miniaturization. As wavelength reduction and NA increases approach their physical limits, new techniques are emerging, such as multiple patterning, which repeats optical lithography several times, and EUVL (Extreme Ultra-Violet Lithography). However, because these techniques add process steps and require additional process tools, an increase in process cost is inevitable.

To counter this increase in lithography cost, we are developing nanoimprint lithography, which can miniaturize devices at lower cost. The nanoimprint technique transfers nanoscale patterns from a template to a Si wafer by imprinting and, unlike conventional lithography tools, does not require a lens optical system for reduction projection.

Nanoimprint is a highly anticipated next-generation lithography method for realizing advanced memory devices at reduced cost.

Nanoimprint lithography

Analytical Technologies for Next-Generation Devices

To achieve high-performance, highly functional next-generation memory devices, it is essential to have (1) device design and process technology for 3D nanostructures, (2) material technologies that can introduce various functional thin films, and (3) analysis technology that can reveal device nanostructure and material composition.

As many 3D memory nanostructures consist of intricately stacked thin films, it is very important to accurately understand the nanostructures of individual films, the interfaces between them, and the elemental composition distribution in order to realize high-performance and high-reliability devices. Analyzing nanometer-level 3D structures requires new analytical techniques, and we are advancing various analysis methods to accomplish this.

Specifically, Atom Probe Tomography (APT) can reveal 3D elemental distribution by counting the atoms one by one, as shown in the left figure. The right figure is an example of transistor (MOSFET) elemental analysis that can successfully visualize the 3D profile of elements on the nanometer level.

The principle of the APT (left); an atom map of a transistor (right)

Image processing technology utilizing machine learning

State-of-the-art semiconductor manufacturing requires highly accurate defect inspection even if the defects are very small. We are developing a new inspection technique utilizing not only conventional image processing but also machine learning.

The left-hand figure below shows an example of conventional defect inspection in the semiconductor manufacturing process using an SEM (Scanning Electron Microscope). Defects such as open or short failures of metal wires on a semiconductor wafer are detected by comparing the SEM image with the CAD layout(*1) of the circuit. However, because the pattern transferred onto a wafer is not identical to the CAD layout, non-defects may be falsely detected as defects. We have developed the novel inspection technique shown in the right-hand figure below. We apply machine learning to generate a virtual SEM image from the CAD layout and compare it with the actual SEM image to obtain more accurate results(*2). We will continue to introduce machine learning techniques as they advance and to develop technologies that contribute to higher yields and higher quality of our products.
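The effect of the reference image on false detections can be shown with a toy comparison. The sketch below is an assumption-laden illustration: the tiny binary images are made up, and `virtual_sem` is a hand-written stand-in for the output of a machine learning model, which is not implemented here.

```python
def diff_pixels(reference, image, threshold=0.5):
    """Return (row, col) positions where the image deviates from the reference."""
    if len(reference) != len(image) or any(
        len(a) != len(b) for a, b in zip(reference, image)
    ):
        raise ValueError("reference and image must have the same shape")
    return [
        (r, c)
        for r, (ref_row, img_row) in enumerate(zip(reference, image))
        for c, (ref, img) in enumerate(zip(ref_row, img_row))
        if abs(ref - img) > threshold
    ]


# CAD layout: an ideal wire with sharp corners (1 = metal, 0 = space).
cad = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
# Observed SEM image: corners rounded by patterning, plus a real void at (2, 2).
sem = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
# Stand-in for a model-generated virtual SEM image that predicts the rounding.
virtual_sem = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

cad_hits = diff_pixels(cad, sem)          # includes the rounded corners
ml_hits = diff_pixels(virtual_sem, sem)   # only the real void remains
print("against CAD layout :", cad_hits)
print("against virtual SEM:", ml_hits)
```

Comparing against the CAD layout flags the two rounded corners as well as the void, while comparing against the virtual SEM image leaves only the void.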
 

*1 CAD (Computer Aided Design) drawing for semiconductor IC manufacturing (e.g., wiring)

*2 Joint development with Toshiba Corp.

The result of the defect inspection with the CAD layout (left), and with machine learning (right).

RIE Technology Supporting BiCS FLASH™

BiCS FLASH™ incorporates various innovative solutions to minimize the cost increase caused by changing the memory structure from 2D to 3D. For example, to fabricate BiCS FLASH™ memories, electrode and dielectric layers are stacked alternately, and then holes are punched through all the layers at once to reduce the number of manufacturing processes. The next step is the simultaneous deposition of dielectric films inside all the through-holes, followed by the formation of electrode columns. In this structure, each intersection between a stacked electrode layer and an electrode column forms a memory cell (Figure 1). For these manufacturing processes, plasma etching (RIE*1) technology is crucial for forming deep memory holes with a uniform diameter. To achieve the optimum hole shape, it is necessary to develop not only new mask materials and etching gases but also shape and plasma control technologies. We are working to further increase the number of layers by leveraging surface and gaseous layer control technologies as well as various simulation technologies.
 

*1 RIE: Reactive Ion Etching

Fig. 1: Formation of BiCS FLASH™ memory cells      Fig. 2: Underlying technologies for plasma etching

Development of Monocrystalline Si Channel Process for 3D Flash Memory

To further increase the bit density of 3D flash memory, we have been studying technology for highly stacked 3D flash, in which the number of vertically stacked word lines is increased. Highly stacked structures suffer performance degradation due to increased channel resistance and cell Vth variation, which derives from grain boundaries in poly-Si. A monocrystalline Si channel is one of the ultimate solutions, and we worked on forming a macaroni-shaped monocrystalline Si channel in vertical memory holes. As a method of forming monocrystalline Si, we focused on metal-induced lateral crystallization (MILC), a solid-phase crystallization technology studied for Si TFTs (Thin Film Transistors) that uses a metal material such as nickel silicide as the growth edge [1]. By applying MILC technology to the Si film in the vertical memory holes, we successfully formed monocrystalline Si from amorphous Si via nickel silicide (Fig. 1). The 3D flash memory cell devices equipped with this technology demonstrated superior electrical characteristics and reduced variation compared to conventional devices using poly-Si as the channel (Fig. 2). This achievement was presented at the 2019 IEEE International Electron Devices Meeting [2].

【References】

[1] S.-W. Lee and S.-K. Joo, “Low temperature poly-Si thin-film transistor fabrication by metal-induced lateral crystallization”, IEEE Electron Dev. Lett. 17, pp.160-162 (1996)

[2] H. Miyagawa, H. Kusai, R. Takaishi, T. Kawai, Y. Kamimuta, T. Murakami, K. Ariyoshi, T. Asano, M. Goto, M. Fujiwara, Y. Mitani, T. Obu and H. Aochi, “Metal-Assisted Solid-Phase Crystallization Process for Vertical Monocrystalline Si Channel in 3D Flash Memory”, 2019 IEEE International Electron Devices Meeting, pp.650-653

Figure 1: Snapshots of in-situ TEM images of metal-assisted solid-phase crystallization (SPC)

Figure 2: Vg-Icell characteristics for the poly-Si channel and the metal-assisted SPC channel

14nm Half-pitch Direct Patterning with Nanoimprint Lithography

To cope with the shrinking of semiconductor device pattern dimensions below a 30 nm half pitch and with increasing fabrication cost, we are developing low-cost Nanoimprint Lithography (NIL) (please refer to the article “Next-generation lithography (NGL) process: Nanoimprint” on this website). We have fabricated a 14 nm half-pitch template using the self-aligned double patterning method (Figure 1). Using this template, we have successfully fabricated a 14 nm half-pitch resist pattern (Figure 2) as well as Si lines (Figure 3) on a 300 mm wafer. We have thus demonstrated that NIL is a promising NGL candidate for future miniaturized devices and has the potential to lower fabrication cost by reducing the number of process steps.

Since NIL requires direct contact between the fine-pattern template and the wafer, three problems must be solved: extending template life by reducing potential damage to the template, improving throughput by promoting faster resist filling into the template features, and achieving high overlay accuracy by controlling template distortion. We have demonstrated long template life by developing technologies for in-situ particle removal from the template (Figure 4), high throughput by developing a new resist and gas-permeable spin-on-carbon (GP-SOC) for faster resist filling (Figure 5), and high overlay accuracy using technology to control high-order distortions (Figure 6). We presented these achievements at the SPIE Advanced Lithography conference in San Jose, USA, in February 2019 [1].

Reference

[1] T. Kono, M. Hatano, H. Tokue, H. Kato, K. Fukuhara, and T. Nakasugi, “Half pitch 14nm direct patterning with Nanoimprint Lithography”, Proceedings of SPIE - The International Society for Optical Engineering, 10958 (2019)

Figure 1: 14nm half pitch template

Figure 2: 14nm half pitch resist pattern on wafer

Figure 3: 14nm half pitch etched pattern on wafer

Figure 4: Template life trend

Figure 5: NIL throughput trend

Figure 6: NIL overlay accuracy by high-order distortion correction

System technology

HMB (Host Memory Buffer) technology for DRAM-less SSDs

In recent years, laptop computers have become thinner and thinner, and built-in SSDs are required to be smaller and lower in cost. However, eliminating the DRAM on an SSD to reduce the number of parts generally degrades the SSD's data read/write performance.

We have successfully developed HMB (Host Memory Buffer) technology to realize a DRAM-less, high-performance, one-package SSD. With HMB technology, the SSD uses part of the host memory (DRAM) as if it were its own and achieves performance equivalent to that of an SSD with DRAM.

As cooperation between the host driver and the SSD is necessary, we developed HMB protocols for booting and connection and, together with major CPU/OS vendors, had them incorporated into NVMe 1.2*, the interface standard for PCIe® SSDs.

A DRAM-less, high-performance, one-package SSD with HMB technology is now offered by our SSD division as the BG series SSD, one of our main consumer SSD products. We will continue to develop advanced technologies for high-performance, small, and low-cost SSDs.
 

* An interface specification developed for SSDs
  NVMe is a trademark of NVM Express, Inc. PCIe is a registered trademark of PCI-SIG.

Conventional SSD (left) and HMB-SSD (right): HMB-SSD utilizes a part of host DRAM instead of DRAM on SSD.

Development of high-speed and high-energy-efficiency algorithm and hardware architecture for deep learning accelerator

We have developed an AI accelerator for deep learning and presented it at A-SSCC 2018, an international conference on semiconductor circuits.

Deep learning requires a huge number of multiply-accumulate (MAC) computations, which lead to long computation times and large power consumption. To address this, we introduced two new techniques: “filter-wise optimized quantization with variable precision” (Fig. 1) and a “bit parallel MAC hardware architecture” (Fig. 2).

The filter-wise technique optimizes the number of weight bits for each of the tens to thousands of filters in every layer. At an average bit precision of 3.6 bits, the recognition accuracy of layer-wise optimized quantization (Fig. 1 middle) falls below 50%, whereas the proposed filter-wise quantization maintains almost the same accuracy as before quantization while reducing computation time.
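The idea of giving each filter its own bit width can be sketched as below. This is only an illustration under assumptions: the uniform symmetric quantizer, the mean-squared-error tolerance used to pick the bit width, the candidate range, and all names and values are hypothetical; the selection criterion actually used in the presented work is not described here.

```python
def quantize(weights, bits):
    """Uniform symmetric quantization of one filter's weights to `bits` bits."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed grid")
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    q_max = 2 ** (bits - 1) - 1
    scale = max_abs / q_max
    return [round(w / scale) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def choose_bits(weights, tolerance, candidates=range(2, 9)):
    """Smallest candidate bit width whose quantization error is within tolerance."""
    for bits in candidates:
        if mse(weights, quantize(weights, bits)) <= tolerance:
            return bits
    return max(candidates)


# Two hypothetical filters of one layer (flattened weights).
filters = [
    [0.5, -0.5, 0.5],         # representable with very few bits
    [0.9, -0.1, 0.33, 0.71],  # needs finer steps
]
bits_per_filter = [choose_bits(f, tolerance=1e-4) for f in filters]
average_bits = sum(bits_per_filter) / len(bits_per_filter)
print("bits per filter:", bits_per_filter)
print("average bit precision:", average_bits)
```

A layer-wise scheme would have to give every filter the width needed by the most demanding one, whereas the per-filter choice lets simple filters use fewer bits and lowers the average.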

The bit serial technique (Fig. 2 left) is often used for MAC architectures, but if it is applied to filter-wise quantization (Fig. 2 middle), the execution time varies with the bit precision of each filter, and the PE (Processing Element) assigned to the filter with the largest computation may become a bottleneck. The bit parallel technique (Fig. 2 right), on the other hand, splits weights of differing bit precision into individual bits, assigns the bits to several PEs one by one, and operates the PEs in parallel. This raises the utilization efficiency of the PEs to almost 100% and also increases throughput.
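The following sketch illustrates two points from the paragraph above: a MAC can be computed one weight bit at a time, and spreading those bits over PEs evens out the load. It is a simplified model under assumptions (unsigned integer weights, one bit plane per PE per cycle, no scheduling overhead) with hypothetical names and numbers, not the presented hardware architecture.

```python
import math


def mac_by_bit_planes(activations, weights, bits):
    """Dot product computed one weight bit plane at a time (unsigned weights)."""
    total = 0
    for b in range(bits):
        # Sum the activations whose weight has bit b set, then shift into place.
        partial = sum(a for a, w in zip(activations, weights) if (w >> b) & 1)
        total += partial << b
    return total


def bit_serial_cycles(filter_bits):
    """One PE per filter: every PE waits for the filter with the most bits."""
    return max(filter_bits)


def bit_parallel_cycles(filter_bits, num_pes):
    """Bit planes of all filters spread evenly over the available PEs."""
    return math.ceil(sum(filter_bits) / num_pes)


def utilization(filter_bits, num_pes, cycles):
    """Fraction of PE-cycles that do useful bit-plane work."""
    return sum(filter_bits) / (num_pes * cycles)


acts, wts = [3, 1, 4, 1], [5, 2, 7, 0]
direct = sum(a * w for a, w in zip(acts, wts))
print("bit-plane MAC:", mac_by_bit_planes(acts, wts, bits=3), "direct:", direct)

filter_bits = [2, 3, 8, 3]  # hypothetical per-filter bit widths
pes = 4
serial = bit_serial_cycles(filter_bits)
parallel = bit_parallel_cycles(filter_bits, pes)
print("bit serial  :", serial, "cycles, utilization",
      utilization(filter_bits, pes, serial))
print("bit parallel:", parallel, "cycles, utilization",
      utilization(filter_bits, pes, parallel))
```

In this toy case the 8-bit filter holds the bit serial design at 50% utilization, while distributing the 16 bit planes over four PEs keeps all of them busy.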

We implemented our algorithm and hardware architecture with ResNet-50(*1) on an FPGA(*2) and demonstrated an ImageNet(*3) image recognition test running at 5.3 times the computation throughput, with computation time and energy consumption reduced to as low as 18.7%.
 

*1…ResNet-50: A deep neural network commonly used to benchmark deep learning for image recognition

*2…FPGA: Field Programmable Gate Array

*3…ImageNet: A large image database containing over 14,000,000 images, commonly used to benchmark image recognition

Fig. 1: Conventional quantization with fixed 16bit (upper), layer-wise quantization (middle), proposed filter-wise quantization (bottom).

Fig.2: Layer-wise quantization and bit serial architecture (left), filter-wise quantization and bit serial architecture (middle), and filter-wise quantization and bit parallel architecture (right).

A 25.6Gb/s Interface with Ring Topology for High-Bandwidth and Large-Capacity Storage Systems

High-speed, large-capacity storage is increasingly required by big-data applications in areas such as medicine and finance. To achieve large-capacity storage using NAND flash memory and BiCS FLASH™ (hereinafter referred to as NAND), many NAND packages have to be connected to a controller. However, connecting many NAND packages to a single channel creates a large load capacitance that degrades the operation speed, as shown in Figure 1(a). On the other hand, if many channels are used to reduce the load capacitance per channel, the number of high-speed signal lines increases, which complicates the printed circuit board layout near the controller, as shown in Figure 1(b).

To solve these issues, we have proposed a daisy-chain configuration (*1) using bridge chips to achieve high-speed operation and large capacity with fewer signal lines, as shown in Figure 1(c) [1,2]. To reduce the chip area and power consumption of the bridge chip, three techniques have been developed [2]. The first reduces the number of transceivers on the bridge chip from two to one by using a ring-topology connection between the controller and the bridge chips. The second lowers the operation speed of the bridge chips' circuits, relaxing their required performance, by using PAM4 (*2) serial communication instead of conventional NRZ (*3), which uses only binary signals of 0 or 1. The third reduces chip area and power consumption by eliminating the PLL (*4) circuit on the bridge chip using new cascaded CDR (*5) circuits with improved jitter (*6) characteristics.
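Why PAM4 relaxes circuit speed can be seen from a generic encoding example: each PAM4 symbol carries two bits, so the symbol rate is half the bit rate. The sketch below is a textbook illustration, not the multiplexing scheme of [2]; the Gray-coded level mapping and the normalized levels are assumptions.

```python
# Gray-coded mapping of bit pairs to four amplitude levels (assumed mapping).
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def nrz_encode(bits):
    """NRZ: one binary level per bit."""
    return [1 if b else -1 for b in bits]


def pam4_encode(bits):
    """PAM4: one of four levels per pair of bits."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def symbol_rate(bit_rate, bits_per_symbol):
    """Symbols per second needed to carry the given bit rate."""
    return bit_rate / bits_per_symbol


data = [0, 0, 0, 1, 1, 1, 1, 0]
print("NRZ symbols :", nrz_encode(data))
print("PAM4 symbols:", pam4_encode(data))
print("NRZ  symbol rate at 25.6 Gb/s:", symbol_rate(25.6e9, 1) / 1e9, "GBaud")
print("PAM4 symbol rate at 25.6 Gb/s:", symbol_rate(25.6e9, 2) / 1e9, "GBaud")
```

The trade-off is that four levels must share the same voltage swing, which is why the eye diagram of a PAM4 link shows three stacked openings instead of one.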

The die micrograph of the fabricated chip is shown in Figure 2(a). The bridge chip was fabricated in a 28 nm CMOS process and can act as either a bridge chip or a controller. The evaluations were performed by connecting four bridge chips and two controllers, as shown in Figure 2(b) and (c).

Figure 3(a) shows the PAM4 signal output from controller C measured by an oscilloscope; a good PAM4 waveform is observed. This PAM4 signal is received by the final controller C’ via four bridge chips, B0 to B3. Figure 3(b) shows an eye diagram (*7) observed using the on-chip eye monitor on the final controller C’; three clear eye openings are obtained in each cycle of the waveform. Figure 3(c) shows the measured bit error rate (BER) of bridge chip B0 and the final controller C’; a sufficiently low BER of less than 10^-12 is obtained in both cases. Through similar measurements for all chips in the daisy chain, we confirmed 25.6-Gbps PAM4 communication with sufficient performance for all bridge chips and controllers. These results demonstrate the possibility of achieving high-speed, large-capacity storage using the bridge chip and were presented at the International Solid-State Circuits Conference 2019 (ISSCC 2019) in San Francisco in February 2019 [2].

*1…Daisy-chain configuration: A configuration of propagating signals along multiple devices connected in series.

*2…PAM4: 4-Level Pulse Amplitude Modulation

*3…NRZ (Non-return-to-zero): A binary code that does not return to zero volts between bits

*4…PLL: Phase Locked Loop, a circuit to generate a reference signal

*5…CDR: Clock Data Recovery, a method to recover a clock signal from a received data signal

*6…Jitter: Fluctuation in the time domain of the clock or signal waveforms

*7…Eye Diagram: A graph in which repetitively sampled signal waveforms are superimposed

【References】

[1] Y. Tsubouchi, D. Miyashita, Y. Satoh, T. Toi, F. Tachibana, M. Morimoto, J. Wadatsumi, and J. Deguchi, “A 12.8 Gb/s Daisy Chain-Based Downlink I/F Employing Spectrally Compressed Multi-Band Multiplexing for High-Bandwidth and Large-Capacity Storage Systems,” 2018 Symposium on VLSI Circuits, pp. 149-150 (2018)

[2] T. Toi, J. Wadatsumi, H. Kobayashi, Y. Shimizu, Y. Satoh, M. Morimoto, R. Ito, M. Ashida, Y. Tsubouchi, M. Nozawa, G. Urakawa, and J. Deguchi, "A 25.6Gb/s Uplink-Downlink Interface Employing PAM-4-Based 4-Channel Multiplexing and Cascaded CDR Circuits in Ring Topology for High-Bandwidth and Large-Capacity Storage Systems", 2019 IEEE International Solid - State Circuits Conference - (ISSCC), pp. 478-480 (2019)

Figure 1: Connection between controller and NAND*. (a) Using one channel: operation speed is degraded by the large capacitive load of many NAND*. (b) Using many channels: layout design near the controller is complicated by too many signal lines. (c) Using a daisy-chain configuration with bridge chips.

*…NAND Flash Memory or BiCS FLASH™

Figure 2: Measurement setup. (a) Die photograph of the bridge chip. (b) Configuration for measurement; the bridge chip can be operated as a controller in measurements. (c) Stacked evaluation boards.

Figure 3: Measured results. (a) Output PAM4 signal of the transmitter in the controller. (b) Clearly opened eye at the receiver in the controller. (c) BER at the receiver in the controller; a BER of less than 10^-12 is obtained.

Production management technology

Factory Innovation

As the capacity of memory products increases, the amount of data handled at a factory is growing considerably. Unlike automobiles, flash memories are manufactured using a complicated network of more than 5,000 pieces of manufacturing and inspection equipment. To maintain high quality, more than two billion data items are collected every day in real time from manufacturing equipment and transport systems. Complicated factory analyses are performed using this enormous amount of data. For example, deep learning technologies help to greatly reduce the percentage of devices rejected by defect tests, and AI technologies help reduce the time required to infer the cause of defects. In addition to the Yokkaichi Plant, KIOXIA Corporation is currently constructing a memory fab in the city of Kitakami, Iwate Prefecture. We are introducing state-of-the-art tools and promoting open innovation both within and outside the company with the aim of achieving efficient production at the two sites.

Example of big-data utilization at Yokkaichi Plant
