-
With the rapid development of artificial intelligence (AI), efficient information sensing and processing are highly desired for real-time interaction with dynamic environments [1, 2]. In conventional perception systems, multiple sensors perceive sensory information from the surrounding environment, the resulting raw information is stored by memory units, and is subsequently analyzed by a processing unit to conduct complex processing tasks. The processes of analogue-to-digital conversion (ADC) and data transmission among different units will result in enormous redundant data and power consumption, reducing the overall efficiency of the artificial perceptual system [3-5]. Furthermore, cross-modal perception of different sensory cues suffers from the sequential nature of the operations that are implemented by indispensable circuits and software [6, 7]. Differently from these artificial implementations, human perception supports the integration and interaction of different senses such as vision, hearing, smell, taste, and touch. This ability for multimodal perception makes it possible for biological systems to adapt to their complex surroundings [8, 9]. To efficiently integrate and analyze information from dynamic environments and optimize resulting decisions, it is critically important to carry out reliable evaluation and weight adjustment of different sources of sensory information. For example, sound detection can guide visual attention to blind areas for the purpose of avoiding danger. Meanwhile, visual-haptic coordination improves the accuracy of perceptual judgments and actions, such as grasping objects or comparing textures [10]. Inspired by the human perceptual system, the development of electronic counterparts with multimodal integration may provide a suitable platform for intelligent interaction.
Memristors are emerging neuromorphic devices with ultrafast speed, low energy consumption, and high density. They are potentially applicable to various fields, including information storage, logic operation, and neuromorphic computing [11-15]. Thanks to their structural and functional similarity with biological synapses, memristors carry great promise for the construction of artificial perceptual systems [16-20]. Within memristive devices, the migration and diffusion of ions/electrons can be coordinated and modulated using multi-dimensional stimuli such as light, heat, and pressure [21-25]. The internal dynamical processes generated by multidimensional signals are regarded as fundamental components of effective multimodal sensing. Ohm’s law and Kirchhoff’s law support not only data sensing and storage via resistance change, but also the functionality of information processing. The novel architecture of memristors carries great potential for addressing the limitations imposed by von Neumann’s bottleneck [13, 17]. Recently, several research groups have developed multimodal integrated memristors by utilizing structural integration with sensor or materials optimization [8, 26-29]. The resulting near-sensor and in-sensor computing architectures enable elimination of the frequent transfer and conversion between sensing and processing units. Long data delays and high power consumption can be avoided with multifunctional integration. Compared with traditional digital sensing, near-sensor/in-sensor computing provides an effective method for implementing feature extraction and identification of unstructured data [30-34]. By exploiting the above principles and technical advances, accurate object recognition and decision-making have been realized with multimodal fusion of sensory information.
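As a hedged illustration of how Ohm's law and Kirchhoff's current law allow a memristor array to process information in place, the following sketch computes the column currents of an idealized crossbar. The function name `crossbar_currents`, the conductance values, and the input voltages are illustrative assumptions and are not taken from any of the cited devices; wire resistance and sneak-path currents are ignored.

```python
def crossbar_currents(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G_ij.

    voltages:     list of row input voltages (V), one per row
    conductances: list of rows, each a list of device conductances (S)
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("one input voltage per row is required")
    currents = [0.0] * n_cols
    for i in range(n_rows):
        for j in range(n_cols):
            # Ohm's law per device; Kirchhoff's current law sums each column.
            currents[j] += voltages[i] * conductances[i][j]
    return currents


if __name__ == "__main__":
    # Hypothetical 3 x 2 array with microsiemens-range conductances.
    G = [[1e-6, 2e-6], [3e-6, 4e-6], [5e-6, 6e-6]]
    V = [0.1, 0.2, 0.3]
    print(crossbar_currents(V, G))
```

In this picture the stored conductances act as weights and the input voltages as the data vector, so a weighted sum is obtained in a single read step without moving data to a separate processor.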
In this review, we focus on recent progress in multimodal memristor technology and its application to bio-realistic neuromorphic perception. First, we provide an overview of the research background behind multimodal perceptual systems based on memristors. Then recent advances in memristor-based multimodal perception are introduced, including vision, touch, olfaction, and audition. Furthermore, we provide a detailed discussion and understanding of the mechanisms and functions associated with artificial perceptual systems. We also outline the principles and mechanisms underlying integrated processing of multimodal data for more accurate recognition and decision-making. Finally, we summarize existing challenges hindering the development of multimodal memristors and present a short discussion on future prospects.
-
Human multimodal perception integrates different sources of sensory information to make accurate judgments about object properties [35, 36]. In conventional artificial perceptual systems, the two processes of ADC and serial processing result in high power consumption and circuit complexity [37, 38]. Inspired by biological perception, neuromorphic systems based on multimodal memristors support efficient information integration and high fault-tolerance [39-41]. Recently, neuromorphic perceptual systems with visual, tactile, auditory, and olfactory sensation have been successfully demonstrated [37, 42-48] (see figure 1). These research advances will be discussed in detail in the following section.
2.1. Artificial system for visual perception
Traditional machine vision has been widely applied in various fields, including healthcare, security, and manufacturing. However, physically separated sensing, storage, and processing units generate a large amount of redundant data and consume considerable power. Inspired by human vision, neuromorphic visual systems with near-/in-sensor computing architectures have been developed for efficient image perception [43, 49, 50]. Optoelectronic memristive devices integrate the functionalities of sensing and processing, and therefore show significant potential for neuromorphic vision [33, 51-54]. In recent years, artificial visual systems based on optoelectronic memristors have attracted extensive attention, and significant advances have been made in this field.
Generally, a color-mixed pattern contains ten times more information than a monochrome one, supporting superior discrimination of image detail [55]. Several approaches have been proposed to implement color-mixed pattern recognition. For instance, Seo et al developed an optic-neural synaptic (ONS) device based on an h-BN/WSe2 heterojunction, as demonstrated in figure 2(a) [42]. The ONS device hosts both an optical sensing device and a synaptic device on the same van der Waals heterojunction. Optical signals with shorter wavelengths decrease the resistance of the sensing device more effectively than those with longer wavelengths, resulting in more carriers in the weight control layer. As shown in figure 2(b), the ONS device exhibits long-term potentiation/depression within various conductance regions under irradiation from different light sources, with shorter wavelengths inducing a more significant increase in conductance. By integrating the optical sensing and synaptic devices, the system can perform color-mixed pattern recognition (see figure 2(c)). A 28 × 28 device array was constructed, with each cell group consisting of three neurons responding to red (R), green (G), and blue (B) light, respectively. In figure 2(c), colored numbers 1 and 4 were selected as training datasets, while mixed-color digits were selected as test images. The comparison shows that the optic-neural network achieved a recognition rate >90%, against <40% for conventional neural networks. The integration of optical sensing and synaptic dynamics therefore holds promise for implementing complex visual functions.
The human retina performs image sensing and real-time preprocessing at the same time. Developing retinomorphic devices is therefore beneficial for improving image feature detection and visual perception. Zhou et al developed an optoelectronic resistive random access memory (ORRAM) with a Pd/MoOx/ITO structure [43] (see figure 2(e)). The non-volatile switching characteristics of this device can be attributed to the valence change of Mo ions. Under ultraviolet irradiation, the photogenerated electrons and protons change Mo6+ to Mo5+, which results in resistance variation. At the same time, the resulting electric field induces the drift of protons, and the Mo ions return to Mo6+ [56]. Furthermore, the ORRAMs exhibit light-dosage-dependent resistance states, which serve as foundational elements for image preprocessing. In these elements, high optical intensity induces large changes in photocurrent amplitude and prolonged retention time. Based on the nonlinear accumulation effects induced by optical intensity, image contrast enhancement has been realized in an 8 × 8 ORRAM array. As shown in figure 2(f), the output current image presents enhanced features compared with the corresponding input optical image. Such preprocessing can effectively improve the recognition rate of subsequent processing tasks.
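The contrast enhancement described above can be sketched numerically under a simplifying assumption. The power-law response below (output proportional to intensity raised to `gamma`, with `gamma` > 1) is a hypothetical stand-in for the nonlinear, light-dosage-dependent behavior of the array and is not the measured ORRAM characteristic; the Michelson contrast is used here only as a convenient figure of merit.

```python
def nonlinear_response(image, gamma=2.0):
    """Apply an assumed superlinear pixel response: out = in ** gamma."""
    return [[pixel ** gamma for pixel in row] for row in image]


def michelson_contrast(image):
    """Michelson contrast (max - min) / (max + min) of a 2D image."""
    flat = [pixel for row in image for pixel in row]
    lo, hi = min(flat), max(flat)
    if hi + lo == 0:
        raise ValueError("contrast is undefined for an all-zero image")
    return (hi - lo) / (hi + lo)


if __name__ == "__main__":
    # Hypothetical normalized input: dim background (0.4), bright feature (0.8).
    optical = [[0.4, 0.4, 0.4],
               [0.4, 0.8, 0.4],
               [0.4, 0.4, 0.4]]
    current = nonlinear_response(optical)
    print(michelson_contrast(optical), michelson_contrast(current))
```

Because bright pixels are amplified more strongly than dim ones, the feature stands out further from the background in the output map, which is the qualitative effect reported for the array.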
Compared with neuromorphic devices with optical-electric hybrid operation, all-optically modulated devices can operate without complex operations [57-59]. Exploration of novel optoelectronic memristors with reversible optical modulation is still cutting-edge research in neuromorphic computing. Hu et al developed all-optically controlled memristors based on bilayered OD-IGZO/OR-IGZO [60] (see figure 3(a)). Their device supports optical SET and RESET processes under the action of blue and near-infrared light pulses, respectively. Figure 3(b) shows an example of continuous increase and decrease in conductance. The underlying mechanism is related to the competition between ionization and neutralization of oxygen vacancies. During the optical SET process, blue light induces the transformation from VO to VO2+ compounds, causing a decrease in carrier width. During the optical RESET process, the neutralization of VO compounds becomes dominant and device conductance decreases. By exploiting the all-optical modulation outlined above, spike-timing-dependent plasticity is emulated with blue and near-infrared light pulses, as shown in figure 3(c). A single blue light pulse and ten near-infrared light pulses are selected as the pre- and postsynaptic spikes, respectively. When the presynaptic spike arrives first (Δt > 0), the synaptic weight increases; otherwise (Δt < 0), the synaptic weight decreases.
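The timing rule described above can be illustrated with the widely used pair-based exponential form of spike-timing-dependent plasticity. This is a generic textbook sketch, not the fitted response of the IGZO device: the spike-timing difference (written `delta_t_ms` here, positive when the presynaptic spike arrives first), the amplitudes `a_plus` and `a_minus`, and the time constants are all assumed values.

```python
import math


def stdp_weight_change(delta_t_ms, a_plus=1.0, a_minus=1.0,
                       tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Pair-based exponential STDP.

    delta_t_ms > 0: presynaptic spike first  -> potentiation (positive change)
    delta_t_ms < 0: postsynaptic spike first -> depression  (negative change)
    """
    if delta_t_ms > 0:
        return a_plus * math.exp(-delta_t_ms / tau_plus_ms)
    if delta_t_ms < 0:
        return -a_minus * math.exp(delta_t_ms / tau_minus_ms)
    return 0.0  # simultaneous spikes: no change in this simple model


if __name__ == "__main__":
    for dt in (-40.0, -10.0, 10.0, 40.0):
        print(dt, stdp_weight_change(dt))
```

The magnitude of the weight change decays as the two spikes move further apart in time, so closely paired spikes produce the strongest modification.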
Besides the emulation of basic synaptic functions, Shan et al developed an efficient neuromorphic visual system using a novel plasmonic optoelectronic memristor [61]. The Au and FTO were selected as top and bottom electrodes, while the functional layer consisted of silver nanoparticles and porous TiO2 film (figure 3(d)). This memristive device exhibits long-term potentiation and depression under illumination from visible and ultraviolet light, respectively. These response properties rely on localized surface plasmon resonance and optical excitation within the Ag-TiO2 nanocomposite film. All-optical modulation implements pre-processing functions, including contrast enhancement and noise depression. Furthermore, the photo-induced oxidation/reduction of Ag nanoparticles produces an impact on subsequent migration under electrical field, i.e. light-gated synaptic modification [62]. High-level image processing in visual cortex has been emulated via electrically driven memristive behaviors. Therefore, the novel optoelectronic memristor combines multiple functions, from visual sensing to low-level preprocessing and high-level image processing. Multifunctional integration improves recognition accuracy and overall performance of the visual system. As shown in figure 3(h), recognition rates reach up to 98% after 300 learning epochs.
In conclusion, optoelectronic memristors with attractive characteristics have been developed through material optimization and structural design. Among memristive materials, low-dimensional systems exhibit high photogeneration efficiency and structural anisotropy for novel optoelectronic applications. For example, anisotropic materials such as ReS2, ReSe2, single-wall carbon nanotubes, and black phosphorus can be exploited for polarization-sensitive devices, which enable reconfigurable sensory adaptation [57, 63-65]. Organic semiconductor materials have also been widely investigated because of their mechanical flexibility and simple solution-processing techniques [66]. Additionally, oxide materials are considered promising candidates for constructing high-density integrated architectures owing to their excellent stability and compatibility with traditional CMOS processes [67]. Furthermore, combining various photosensitive materials is a feasible strategy for integrating several light-matter interaction processes and achieving the desired photoresponse behaviors. Neuromorphic visual perception systems with high bandwidth and low crosstalk are expected to be developed by utilizing emerging optoelectronic memristors.
2.2. Artificial system for tactile perception
In biological tactile systems, mechanoreceptors below the skin surface collect pressure signals in real time, and the collected pressure data are then transferred to tactile cortex for subsequent processing. Corresponding tactile memories can provide guideline for physical movement, and for adequate response to environmental events. Inspired by biological tactile systems, artificial tactile perceptual systems are expected to achieve efficient human-robot interaction. Recently, Zhu et al developed a spiking neuron for multimodal tactile perception that senses pressure and temperature signals [68]. As shown in figure 4(b), the artificial neuron integrates a NbOx-based threshold switching (TS) memristor with a pressure sensor. Figure 4(a) shows current-voltage curves for the TS memristor, which exhibits negligible cycle-to-cycle fluctuation. Besides that, the NbOx-based device possesses thermal sensing characteristics, whereby threshold voltage decreases with increasing temperature. In the neuron circuit, the capacitor charges under the action of constant input voltage. When the output voltage on node 2 surpasses the threshold voltage, the TS memristor switches on and the capacitor discharges, producing a spike signal. In the context of this process, the pressure sensor acts as a variable resistance and pressure intensity modulates spike output frequency. As shown in figure 4(c), increased temperature results in increased output frequency and reduced amplitude. Furthermore, the artificial neuron array can perform thermally enhanced pattern recognition (see figure 4(d)). The combined frequency reflects both pressure and temperature information. Thermalization increases the output frequency of noiseless pixels, while leaving noisy pixels unchanged. This thermally induced enhancement in frequency difference improves the classification ability of the whole system.
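The dependence of spike frequency on pressure and temperature can be sketched with a simple relaxation-oscillator estimate. The circuit assumed here is an idealization, not the authors' exact design: a constant input `v_in` charges a capacitor through the pressure sensor (treated as a series resistance `r_sensor`), the threshold-switching memristor fires when the capacitor voltage reaches `v_th` and resets it to `v_hold`, and the discharge time is neglected. All component values are hypothetical.

```python
import math


def spike_frequency(v_in, r_sensor, capacitance, v_th, v_hold=0.0):
    """Approximate firing rate (Hz) of an RC relaxation oscillator.

    Charging time from v_hold to v_th through r_sensor:
        t = R * C * ln((v_in - v_hold) / (v_in - v_th))
    The discharge through the switched-on device is assumed instantaneous.
    """
    if not (v_hold < v_th < v_in):
        raise ValueError("require v_hold < v_th < v_in for oscillation")
    t_charge = r_sensor * capacitance * math.log(
        (v_in - v_hold) / (v_in - v_th))
    return 1.0 / t_charge


if __name__ == "__main__":
    base = spike_frequency(v_in=2.0, r_sensor=1e5, capacitance=1e-9, v_th=1.0)
    # Higher pressure: lower sensor resistance (assumed piezoresistive behavior).
    pressed = spike_frequency(v_in=2.0, r_sensor=5e4, capacitance=1e-9, v_th=1.0)
    # Higher temperature: lower threshold voltage, as stated in the text.
    heated = spike_frequency(v_in=2.0, r_sensor=1e5, capacitance=1e-9, v_th=0.8)
    print(base, pressed, heated)
```

Under these assumptions both a lower sensor resistance and a lower threshold voltage shorten the charging time, so the output frequency carries pressure and temperature information at once; a lower threshold also implies a smaller spike amplitude.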
The development of tactile near-sensor and in-sensor computing systems is expected to reduce data redundancy and signal conversion. Jiang et al proposed a high-resolution pressure piezo-memory system by integrating a piezo-nanowire array with a memristor array [48]. A piezoelectric field can be induced within the ZnO piezo-nanowire via uniaxial compression force and flow of free electrons, which generate a piezo-potential. In the memristor device with a Li/Al/MoO3/Au structure, digital resistive switching behavior has been realized via electrochemical doping of the MoO3 layer. Figure 5(a) shows a schematic diagram of the piezo-memory pixel. The piezo-potential of the ZnO nanowire promotes migration of Li ions and results in valence state switching of MoO3 (MoO3 + ne- + nLi+ → LinMoO3). This process increases the conductance of the MoO3 layer, and the non-volatile state can be restored to its initial state with a reset voltage (figure 5(b)). Figure 5(c) shows the corresponding force-set and electrical-reset behaviors in pulse mode. Force-image preprocessing and recognition have been performed with artificial neural networks based on high-resolution piezo-memory devices (see figure 5(d)). The preprocessing function can enhance feature information and smooth background noise, thus improving the efficiency of image recognition.
Wang et al developed a tactile near-sensor analogue computing system by integrating a pressure sensor with a memristor [37] (see figures 6(a) and (b)). The piezoresistive sensor contains PDMS elastomers and silver nanowires, while the structure of the flexible memristor is Au/TiW/HfO2/Au. The HfOx-based memristor exhibits bipolar resistive switching behaviors under voltage sweep (figure 6(c)). Furthermore, the piezoresistive pressure sensor exhibits ultrasensitive characteristics with an ON/OFF ratio of 10^8, thus modulating the bias applied to the memristor. As demonstrated in figure 6(d), a 3 × 3 tactile sensor array can be constructed to detect a fingerprint-like mold. The summed current can be read out as a monitoring voltage across the monitoring resistor (R0). Figure 6(e) shows that the monitoring voltage ranged between 0 V and 0.05 V as a function of different patterns. Furthermore, Wang et al utilized raster scanning to process the overall fingerprint pattern with 40 × 40 pixels. The binary experimental result is consistent with the original pattern (figure 6(f)). As mentioned above, integrating state-of-the-art tactile sensors with synaptic devices is the most common way to construct artificial tactile perception systems. During tactile perception, temporal/spatial resolution and robustness are the dominant factors for a high recognition rate. In addition, ideal tactile perception units with an in-sensor computing architecture are highly desired, since they enable in-situ storage and processing of pressure signals. In the future, it will be of great significance to optimize the hardware structure of tactile perception units to achieve large-scale and high-density integration.
2.3. Artificial system for olfactory perception
The atmospheric environment affects human daily life and production activities. Human olfactory perception plays a significant role in a number of areas, including liquor preparation, determination of food freshness, and auxiliary diagnosis of diseases [69-71]. Inspired by the human olfactory system, artificial olfactory perceptual systems could replace expert staff and meet huge social demand. Wang et al developed an artificial olfactory inference system with memristive devices to classify gases. Their system combines a reservoir computing system based on a W/WO3/PEDOT:PSS/Pt memristor with a three-layer neural network based on a Pd/W/WO3/Pd memristor [46] (figure 7(b)). During the encoding process, the memristive device transforms the response of the gas sensor into spike trains. Figure 7(a) shows response curves of carbon monoxide and methane for sensor TGS2610. Spike trains are activated when the defined response speed is above threshold. Moreover, a reservoir computing system with 16 W/WO3/PEDOT:PSS/Pt devices is constructed to extract gas features from spike trains. Figure 7(c) plots the corresponding temporal responses, in which each spike train leads to deterministic responses. Each memristive device exhibits 20 internal values, so a 16 × 20 matrix can be obtained with 16 cells. Figure 7(d) demonstrates the visualized graph, where each pixel represents device conductance at a given moment. Wang et al also proposed that the number of sensor features and the sampling duration should be reduced to lower the complexity of the reservoir computing system. Corresponding results (figures 7(f) and (g)) show that a classification accuracy of 91% can be achieved with a 16 × 20 input matrix. For comparison, figure 7(f) shows the testing accuracy obtained with different sampling durations, demonstrating that the middle-period sampling duration can reduce information redundancy and achieve higher accuracy. Finally, figure 7(g) shows the results obtained after applying spatial and temporal simplification, demonstrating that the middle-period sampling duration with an 8 × 10 matrix is optimal.
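The spatial and temporal simplification of the reservoir output can be sketched as a simple sub-sampling step. The selection rule below is an assumption made for illustration only: the source does not state which eight devices are retained, so every second device is kept here, and the "middle period" is taken as the central ten time steps. The function name `simplify_features` is hypothetical.

```python
def simplify_features(matrix, row_step=2, window=10):
    """Reduce a devices-by-time feature matrix.

    Keeps every `row_step`-th device (spatial simplification, assumed rule)
    and the central `window` time steps (temporal simplification).
    """
    n_cols = len(matrix[0])
    if window > n_cols:
        raise ValueError("window is longer than the recorded sequence")
    start = (n_cols - window) // 2
    return [row[start:start + window] for row in matrix[::row_step]]


if __name__ == "__main__":
    # Placeholder 16 x 20 matrix; entry r*100 + c encodes device r, time step c.
    full = [[r * 100 + c for c in range(20)] for r in range(16)]
    reduced = simplify_features(full)
    print(len(reduced), len(reduced[0]))
```

Reducing the input from 16 × 20 to 8 × 10 entries cuts the number of inputs to the readout network by a factor of four, which is the kind of complexity saving the simplification is intended to provide.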
Song et al developed an artificial organ-damage memory system with a single organic transistor [72]. In this organic transistor, poly[4-(4,4-dihexadecyl-4H-cyclopenta[1,2-b:5,4-b]dithiophen-2-yl)-alt-[1,2,5]thiadiazolo[3,4-c]pyridine] is selected as the sensing and storage film and NO2 is set as the input signal. The NO2 molecules withdraw electrons and induce the accumulation of hole carriers, thus increasing the sensing current. The organic transistor exhibits a dynamic sensing current under different NO2 concentrations. The current is enhanced under high NO2 concentration in the atmosphere, which corresponds to more serious organ damage. Similarly, prolonged NO2 exposure induces adverse effects on human organs. After the external stimuli are removed, the prolonged decay process produces a cumulative effect in the current, which resembles cumulative organ damage. Furthermore, the organic bionic device exhibits excellent flexibility and conformability. This ultraflexible transistor can conform to various objects, including masks and smart bands. The excellent flexibility and sensing capability of this device demonstrate its potential for wearable health monitoring. In addition, Qian et al fabricated an oxygen-detecting synaptic device with a tri-layered organic double heterojunction (figure 8(a)) [47]. Excitatory and inhibitory responses were achieved in oxygen-rich and oxygen-free environments, respectively (figure 8(b)). The interconversion between these modes can be attributed to the modulation of majority charge carriers induced by oxygen molecules. The negative feedback process of oxygen homeostasis is successfully emulated by the oxygen-sensitive device. Environmental oxygen concentration is reflected by the synaptic current within the device, which is converted into logic values for controlling the oxygen supply rate. Figure 8(c) shows real-time synaptic currents associated with different oxygen environments. In oxygen-free environments, the oxygen-sensitive device exhibits a negligible response under electrical pulse stimulation (+18 V), but the current increases with increasing oxygen concentration (from 4% to 17%). In the fatal environment of 4% oxygen concentration, the response current is below the oxygen-supply threshold (50 nA), which necessitates opening the oxygen supply valve. The oxygen supply valve closes when the current exceeds 100 nA. Oxygen concentration remains between 10% and 17% thanks to the negative feedback process.
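The negative feedback described above amounts to a two-threshold (hysteresis) controller. The sketch below uses the 50 nA and 100 nA thresholds quoted in the text; the function name `update_valve` and the sequence of current readings are hypothetical, and the mapping from oxygen concentration to synaptic current is not modeled.

```python
OPEN_BELOW_NA = 50.0    # valve opens when the synaptic current drops below this
CLOSE_ABOVE_NA = 100.0  # valve closes when the synaptic current exceeds this


def update_valve(current_na, valve_open):
    """Return the new valve state (True = open) for a current reading in nA."""
    if current_na < OPEN_BELOW_NA:
        return True
    if current_na > CLOSE_ABOVE_NA:
        return False
    # Between the thresholds the previous state is kept (hysteresis).
    return valve_open


if __name__ == "__main__":
    state = False
    for reading in (120.0, 80.0, 40.0, 70.0, 110.0):  # hypothetical readings
        state = update_valve(reading, state)
        print(reading, "open" if state else "closed")
```

Keeping the previous state between the two thresholds prevents the valve from switching rapidly when the current hovers near a single set point.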
At present, developing artificial olfactory perception systems with high sensitivity and fast response remains the research focus in the field. It is necessary to investigate how various atmospheric environments influence the dynamic processes in memristive devices, so that the sensing, storage, and processing of gas signals can be integrated in a single device. In particular, combining gas-sensitive materials with different response characteristics is expected to enable the recognition of multiple gas signals, which can improve detection accuracy and range.
2.4. Artificial system for auditory perception
The human auditory system can localize sounds and guide visual attention to specific sound sources, thus collecting important acoustic information about the surrounding environment [73]. The development of artificial auditory perceptual systems represents a feasible strategy for achieving effective human-machine interaction. Sun et al demonstrated sound localization with interaural time difference in an MoS2-based device [44]. Continuously tunable short-term synaptic plasticity was demonstrated by utilizing the resistive heating effect in the MoS2 channel (figures 9(a)-(c)). Figure 9(d) shows the corresponding mechanisms for sound localization, including coincidence detection by interaural time differences and interaural level differences. To suppress interference from interaural level differences, Sun et al encoded information about the interaural time difference to achieve sound localization. As shown in figure 9(e), the ipsilateral cochlear nucleus converts sound signals into high-frequency spikes, resulting in more pronounced paired-pulse facilitation (PPF) compared with the contralateral nucleus. Furthermore, synaptic computation and inhibitory synapses eliminate confounding effects from cues associated with interaural level differences. The final current output shows that only information about the interaural time difference is conveyed for sound localization (figures 9(f)-(g)). In addition, Wang et al demonstrated spatiotemporal cognitive functions in a spiking neural network with memristive synapses [74]. A simple neural network consisting of field-effect transistors and memristors was constructed to achieve time-dependent control of synaptic weight. In this system, a positive time/amplitude correlation of PRE spikes results in a larger response in the POST internal voltage. The positive correlation with specific 'true' sequences can be used to recognize the corresponding spatiotemporal patterns. To avoid the false silence of true patterns, a positive spike VTE is applied to achieve potentiation behavior. During this process, a shorter delay between PRE and POST spikes leads to stronger potentiation. Time-dependent potentiation and depression are demonstrated with HfO2-based synaptic devices. Furthermore, the detection of sound location is emulated with precise timing detection in the spiking neural network. The interaural time difference between the left and right ears provides essential information for sound azimuth identification. Here, the POST internal voltage Vint is a function of the interaural time difference, demonstrating that sound azimuth can be estimated from changes in Vint.
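The link between interaural time difference and azimuth can be made concrete with the standard far-field approximation, ITD = d·sin(θ)/c, where d is the spacing between the two receivers and c is the speed of sound. This is a geometric sketch only and does not model the memristive network; the default spacing of 0.2 m and sound speed of 343 m/s are assumed values, and the function name is hypothetical.

```python
import math


def azimuth_from_itd(itd_s, ear_distance_m=0.2, sound_speed_m_s=343.0):
    """Estimate source azimuth (degrees) from the interaural time difference.

    Uses the far-field relation ITD = d * sin(theta) / c. Positive ITD maps to
    positive azimuth; values beyond the physical limit are clamped to +/-90.
    """
    ratio = sound_speed_m_s * itd_s / ear_distance_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding and noise
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    for itd_us in (0.0, 150.0, 300.0, 583.0):
        print(itd_us, azimuth_from_itd(itd_us * 1e-6))
```

In a neuromorphic implementation the same monotonic mapping would be read from a timing-dependent quantity such as an internal voltage rather than computed explicitly.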
In addition to localization of static sound sources, the human auditory system supports motion perception. When analyzing moving sound sources, the Doppler effect is the preferred cue for estimating target velocity. Zeng et al have demonstrated auditory motion perception with Doppler frequency-shift in a WOx-based memristor [45]. Dynamic high-pass filtering and firing operation have been implemented via spike-rate-dependent plasticity (figure 10(a)). The firing operation with specific comparator (threshold current of 0.4
A) converts the spike train to a single pulse, providing the foundation for detecting interaural time differences. Figure 10(b) shows the experimental conditions of firing and non-firing operation. Based on the above principles, azimuth detection has been demonstrated with relative timing-dependent discrimination. Furthermore, Zeng et al achieved successful velocity estimation with the Doppler effect in the WOx-based device after the activation operation. Figure 10(c) shows a schematic illustration of the Doppler effect and the underlying design principle, in which triplet-STDP is employed to process spatiotemporal information. Figure 10(d) shows a conductance change dependent on spike frequency (delay time of 50 ms). Additionally, the range of conductance change can be extended by modulating the delay time (figure 10(e)). Using the Doppler velocimeter, the relation between the velocity of the moving source and the sound velocity has been obtained (figure 10(f)). In conclusion, suitable materials and delicate structures are indispensable for the perception of sound signals. Piezoelectric and triboelectric materials have been considered promising candidates for auditory perception systems [75]. Generally, analyzing the frequency composition is the key step in interpreting auditory signals [76]. Constructing memristive devices with a broad frequency detection range will promote the development of efficient auditory perception systems.
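The velocity estimate rests on the classical Doppler relation for a source moving along the line of sight toward a stationary listener, f_obs = f_src·c/(c − v). The sketch below inverts that relation to recover the ratio v/c from the two frequencies. It is a textbook calculation under assumed conditions (stationary listener, c = 343 m/s, hypothetical function names) and does not reproduce the triplet-STDP processing used in the device.

```python
def observed_frequency(f_source_hz, v_source_m_s, sound_speed_m_s=343.0):
    """Frequency heard by a stationary listener; v > 0 means approaching."""
    if abs(v_source_m_s) >= sound_speed_m_s:
        raise ValueError("source speed must be below the speed of sound")
    return f_source_hz * sound_speed_m_s / (sound_speed_m_s - v_source_m_s)


def velocity_ratio(f_source_hz, f_observed_hz):
    """Recover v / c from emitted and observed frequencies: v/c = 1 - f_src/f_obs."""
    return 1.0 - f_source_hz / f_observed_hz


if __name__ == "__main__":
    f_obs = observed_frequency(1000.0, 34.3)  # source approaching at 0.1 c
    print(f_obs, velocity_ratio(1000.0, f_obs))
```

A positive ratio indicates an approaching source and a negative ratio a receding one, so the sign and size of the frequency shift together encode the direction and speed of motion.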
2.5. Artificial system for multimodal perception
The supramodal perceptual capabilities of biological systems interpret multimodal sensory data to avoid inaccurate judgments about the properties of complex environments [77-79]. Inspired by human supramodal perception, multimodal artificial perceptual systems integrate the senses of vision, hearing, touch, and smell, thus serving as a promising platform for intelligent interaction. Going far beyond simple information sensing, multimodal systems based on neuromorphic devices fuse various cues efficiently and achieve high-level cognitive functionalities. Yu et al developed mechano-photonic artificial synapses by integrating optoelectronic transistors with triboelectric nanogenerators (TENG) [80]. The graphene/MoS2 heterojunction in the optoelectronic transistor induces persistent photoconductivity caused by electrostatic band bending. At the same time, the triboelectric potential generated by the TENG in contact-separation mode can effectively modulate optoelectronic synaptic plasticity. The drain current decreases significantly with increased illumination and TENG displacement. Inspired by the retina, Yu et al simulated a multilayer perceptron-based artificial neural network for image recognition. The results of supervised learning with various displacements demonstrated that mechanical plasticization improves recognition accuracy.
Wang et al developed a multimodal memristor based on MXene-ZnO and emulated visual perception with different adaptability to humidity [81]. Figure 11(a) shows a schematic illustration of the device structure, in which PDMS and MXene-ZnO were selected as the flexible substrate and resistive layer, respectively. The -OH-terminated MXene is essential for sensing humidity, while the ZnO nanoparticles exhibit photoresponsive behaviors. Figure 11(b) shows the corresponding photon-mediated resistive switching behaviors. Ultraviolet illumination (365 nm) decreases the set voltage and the resistance of the high-resistance state. Figure 11(c) shows I-V curves under various relative humidity conditions. As relative humidity increases, protons are electrostatically attracted to the oxygen vacancies and hinder the formation of conductive filaments. Using this principle, humidity-adapted neuromorphic visual perception is achieved by the MXene-ZnO device (figure 11(d)). Sensing and preprocessing of raw visual information are performed with the MXene-ZnO memristor, and subsequent recognition is implemented by a three-layer artificial neural network. Figure 11(e) shows recognition accuracy under various relative humidity levels, demonstrating environment-adaptable image processing. By exploiting the multimodal device, the artificial visual system achieves background noise reduction and enhancement of intrinsic features.
2.1. Artificial system for visual perception
2.2. Artificial system for tactile perception
2.3. Artificial system for olfactory perception
2.4. Artificial system for auditory perception
2.5. Artificial system for multimodal perception
-
In summary, the multimodal memristor combines the capabilities of sensing and neuromorphic computing, which can establish promising paradigms for developing perceptual systems with high efficiency. Herein, we have introduced various multimodal perceptual systems based on memristor technology, including visual, tactile, olfactory, and auditory senses. The research advances in device structure, material system, operation mode, and memristive mechanism have been reviewed. In addition, several neuromorphic applications have been presented, including color-mixed pattern recognition, fingerprint identification, and auditory motion perception. Furthermore, pattern recognition with the multimodal fusion of sensory information has been introduced and discussed. The final section summarizes the current challenges faced by multimodal perception systems and provides prospects for their future development.
-
Compared with traditional sensors, multimodal memristors are able to integrate and process multimodal signals, which is beneficial for achieving comprehensive judgments. However, some challenges hinder the future development of multimodal memristors for cross-modal integration and perception. Firstly, current multimodal perceptual systems usually require the integration of multiple sensing units or the stacking of different functional layers. The limited processing compatibility between functional layers reduces system integration. The optimization of material selection and device design is expected to achieve multimodal perception in a single device unit. For example, stable materials with mixed ionic-electronic conduction ability have been considered promising candidates for multimodal functions. Analog-to-digital conversion can also be avoided in the in-sensor architecture, which significantly improves system efficiency. Therefore, a multimodal perception system with ultra-low power consumption and high-density 3D integration is expected to be realized. Furthermore, it is of great significance to process larger wafers for more comprehensive perception systems. Device yield, device-to-device uniformity, and endurance are key indicators for highly integrated arrays. Secondly, it is necessary to reduce the dynamic impacts of changing environments in real time. For this issue, a high signal-to-noise ratio can effectively improve the overall accuracy of the system. During multimodal perception, reliance on unimodal sensory information may cause uncertainty and serious perceptual misjudgments in practice. Therefore, the matching and concordance of multimodal signals is of great significance for achieving accurate judgments. It is highly desirable to establish standard guidelines for the analysis of multimodal sensory information.
Inspired by biological perception, encoding stimulus signals into pulse train rather than amplitude change may allow for more efficient information processing. In the next stage, a comprehensive investigation of biological organs and the nervous system is indispensable for the development of more advanced perception systems. Thirdly, most investigations focus on the perception of multisensory signals within the system. Subsequent driving systems should also be developed to demonstrate a closed loop perception-control system. In these applications, biohybrid neuromuscular junctions are indispensable for transmitting signals and controlling biological tissue. At the same time, the biocompatibility and impedance matching of memristive devices cannot be neglected. In the near future, optical interconnection may be a common means to control driving systems.
Although multimodal memristors are still in the early stages of research efforts to implement cross-modal perception, they nevertheless carry promising potential for efficient analysis of complex information. It is anticipated that the above challenges will be addressed by ongoing progress in the field of memristive materials, device fabrication, and electrical system development. We expect the emergence of memristor-based multimodal perception systems with high performance in the coming years.