- •Preface
- •Contents
- •Contributors
- •Modeling Meaning Associated with Documental Entities: Introducing the Brussels Quantum Approach
- •1 Introduction
- •2 The Double-Slit Experiment
- •3 Interrogative Processes
- •4 Modeling the QWeb
- •5 Adding Context
- •6 Conclusion
- •Appendix 1: Interference Plus Context Effects
- •Appendix 2: Meaning Bond
- •References
- •1 Introduction
- •2 Bell Test in the Problem of Cognitive Semantic Information Retrieval
- •2.1 Bell Inequality and Its Interpretation
- •2.2 Bell Test in Semantic Retrieving
- •3 Results
- •References
- •1 Introduction
- •2 Basics of Quantum Probability Theory
- •3 Steps to Build an HSM Model
- •3.1 How to Determine the Compatibility Relations
- •3.2 How to Determine the Dimension
- •3.5 Compute the Choice Probabilities
- •3.6 Estimate Model Parameters, Compare and Test Models
- •4 Computer Programs
- •5 Concluding Comments
- •References
- •Basics of Quantum Theory for Quantum-Like Modeling Information Retrieval
- •1 Introduction
- •3 Quantum Mathematics
- •3.1 Hermitian Operators in Hilbert Space
- •3.2 Pure and Mixed States: Normalized Vectors and Density Operators
- •4 Quantum Mechanics: Postulates
- •5 Compatible and Incompatible Observables
- •5.1 Post-Measurement State From the Projection Postulate
- •6 Interpretations of Quantum Mechanics
- •6.1 Ensemble and Individual Interpretations
- •6.2 Information Interpretations
- •7 Quantum Conditional (Transition) Probability
- •9 Formula of Total Probability with the Interference Term
- •9.1 Växjö (Realist Ensemble Contextual) Interpretation of Quantum Mechanics
- •10 Quantum Logic
- •11 Space of Square Integrable Functions as a State Space
- •12 Operation of Tensor Product
- •14 Qubit
- •15 Entanglement
- •References
- •1 Introduction
- •2 Background
- •2.1 Distributional Hypothesis
- •2.2 A Brief History of Word Embedding
- •3 Applications of Word Embedding
- •3.1 Word-Level Applications
- •3.2 Sentence-Level Application
- •3.3 Sentence-Pair Level Application
- •3.4 Seq2seq Application
- •3.5 Evaluation
- •4 Reconsidering Word Embedding
- •4.1 Limitations
- •4.2 Trends
- •4.4 Towards Dynamic Word Embedding
- •5 Conclusion
- •References
- •1 Introduction
- •2 Motivating Example: Car Dealership
- •3 Modelling Elementary Data Types
- •3.1 Orthogonal Data Types
- •3.2 Non-orthogonal Data Types
- •4 Data Type Construction
- •5 Quantum-Based Data Type Constructors
- •5.1 Tuple Data Type Constructor
- •5.2 Set Data Type Constructor
- •6 Conclusion
- •References
- •Incorporating Weights into a Quantum-Logic-Based Query Language
- •1 Introduction
- •2 A Motivating Example
- •5 Logic-Based Weighting
- •6 Related Work
- •7 Conclusion
- •References
- •Searching for Information with Meet and Join Operators
- •1 Introduction
- •2 Background
- •2.1 Vector Spaces
- •2.2 Sets Versus Vector Spaces
- •2.3 The Boolean Model for IR
- •2.5 The Probabilistic Models
- •3 Meet and Join
- •4 Structures of a Query-by-Theme Language
- •4.1 Features and Terms
- •4.2 Themes
- •4.3 Document Ranking
- •4.4 Meet and Join Operators
- •5 Implementation of a Query-by-Theme Language
- •6 Related Work
- •7 Discussion and Future Work
- •References
- •Index
- •Preface
- •Organization
- •Contents
- •Fundamentals
- •Why Should We Use Quantum Theory?
- •1 Introduction
- •2 On the Human Science/Natural Science Issue
- •3 The Human Roots of Quantum Science
- •4 Qualitative Parallels Between Quantum Theory and the Human Sciences
- •5 Early Quantitative Applications of Quantum Theory to the Human Sciences
- •6 Epilogue
- •References
- •Quantum Cognition
- •1 Introduction
- •2 The Quantum Persuasion Approach
- •3 Experimental Design
- •3.1 Testing for Perspective Incompatibility
- •3.2 Quantum Persuasion
- •3.3 Predictions
- •4 Results
- •4.1 Descriptive Statistics
- •4.2 Data Analysis
- •4.3 Interpretation
- •5 Discussion and Concluding Remarks
- •References
- •1 Introduction
- •2 A Probabilistic Fusion Model of Trust
- •3 Contextuality
- •4 Experiment
- •4.1 Subjects
- •4.2 Design and Materials
- •4.3 Procedure
- •4.4 Results
- •4.5 Discussion
- •5 Summary and Conclusions
- •References
- •Probabilistic Programs for Investigating Contextuality in Human Information Processing
- •1 Introduction
- •2 A Framework for Determining Contextuality in Human Information Processing
- •3 Using Probabilistic Programs to Simulate Bell Scenario Experiments
- •References
- •1 Familiarity and Recollection, Verbatim and Gist
- •2 True Memory, False Memory, over Distributed Memory
- •3 The Hamiltonian Based QEM Model
- •4 Data and Prediction
- •5 Discussion
- •References
- •Decision-Making
- •1 Introduction
- •1.2 Two Stage Gambling Game
- •2 Quantum Probabilities and Waves
- •2.1 Intensity Waves
- •2.2 The Law of Balance and Probability Waves
- •2.3 Probability Waves
- •3 Law of Maximal Uncertainty
- •3.1 Principle of Entropy
- •3.2 Mirror Principle
- •4 Conclusion
- •References
- •1 Introduction
- •4 Quantum-Like Bayesian Networks
- •7.1 Results and Discussion
- •8 Conclusion
- •References
- •Cybernetics and AI
- •1 Introduction
- •2 Modeling of the Vehicle
- •2.1 Introduction to Braitenberg Vehicles
- •2.2 Quantum Approach for BV Decision Making
- •3 Topics in Eigenlogic
- •3.1 The Eigenlogic Operators
- •3.2 Incorporation of Fuzzy Logic
- •4 BV Quantum Robot Simulation Results
- •4.1 Simulation Environment
- •5 Quantum Wheel of Emotions
- •6 Discussion and Conclusion
- •7 Credits and Acknowledgements
- •References
- •1 Introduction
- •2.1 What Is Intelligence?
- •2.2 Human Intelligence and Quantum Cognition
- •2.3 In Search of the General Principles of Intelligence
- •3 Towards a Moral Test
- •4 Compositional Quantum Cognition
- •4.1 Categorical Compositional Model of Meaning
- •4.2 Proof of Concept: Compositional Quantum Cognition
- •5 Implementation of a Moral Test
- •5.2 Step II: A Toy Example, Moral Dilemmas and Context Effects
- •5.4 Step IV. Application for AI
- •6 Discussion and Conclusion
- •Appendix A: Example of a Moral Dilemma
- •References
- •Probability and Beyond
- •1 Introduction
- •2 The Theory of Density Hypercubes
- •2.1 Construction of the Theory
- •2.2 Component Symmetries
- •2.3 Normalisation and Causality
- •3 Decoherence and Hyper-decoherence
- •3.1 Decoherence to Classical Theory
- •4 Higher Order Interference
- •5 Conclusions
- •A Proofs
- •References
- •Information Retrieval
- •1 Introduction
- •2 Related Work
- •3 Quantum Entanglement and Bell Inequality
- •5 Experiment Settings
- •5.1 Dataset
- •5.3 Experimental Procedure
- •6 Results and Discussion
- •7 Conclusion
- •A Appendix
- •References
- •Investigating Bell Inequalities for Multidimensional Relevance Judgments in Information Retrieval
- •1 Introduction
- •2 Quantifying Relevance Dimensions
- •3 Deriving a Bell Inequality for Documents
- •3.1 CHSH Inequality
- •3.2 CHSH Inequality for Documents Using the Trace Method
- •4 Experiment and Results
- •5 Conclusion and Future Work
- •A Appendix
- •References
- •Short Paper
- •An Update on Updating
- •References
- •Author Index
- •The Sure Thing principle, the Disjunction Effect and the Law of Total Probability
- •Material and methods
- •Experimental results.
- •Experiment 1
- •Experiment 2
- •More versus less risk averse participants
- •Theoretical analysis
- •Shared features of the theoretical models
- •The Markov model
- •The quantum-like model
- •Logistic model
- •Theoretical model performance
- •Model comparison for risk attitude partitioning.
- •Discussion
- •Authors contributions
- •Ethical clearance
- •Funding
- •Acknowledgements
- •References
- •Markov versus quantum dynamic models of belief change during evidence monitoring
- •Results
- •Model comparisons.
- •Discussion
- •Methods
- •Participants.
- •Task.
- •Procedure.
- •Mathematical Models.
- •Acknowledgements
- •New Developments for Value-based Decisions
- •Context Effects in Preferential Choice
- •Comparison of Model Mechanisms
- •Qualitative Empirical Comparisons
- •Quantitative Empirical Comparisons
- •Neural Mechanisms of Value Accumulation
- •Neuroimaging Studies of Context Effects and Attribute-Wise Decision Processes
- •Concluding Remarks
- •Acknowledgments
- •References
- •Comparison of Markov versus quantum dynamical models of human decision making
- •CONFLICT OF INTEREST
- •Endnotes
- •FURTHER READING
- •REFERENCES
suai.ru/our-contacts |
quantum machine learning |
94 |
B. Wang et al. |
Fig. 12 A demo of the SQuAD dataset [85]

Question Answering. Unlike expert systems built on structured knowledge, question answering in IR is mainly about retrieval and ranking over a limited set of unstructured candidate documents. In some of the literature, reading comprehension is also considered a question answering task, as in SQuAD QA. Generally speaking, reading comprehension is a question answering task in a specific context, such as a long document in which some internal phrases or sentences serve as answers, as shown in Fig. 12. Table 1 lists currently popular QA datasets.
To compare neural matching models with non-neural models, we focus on TREC QA (answer selection), which has a limited set of answer candidates rather than an unstructured document as context, as in reading comprehension. Some matching methods are shown in Table 2, which mainly draws on the ACL wiki page.^7
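Table 2 compares these methods by MAP and MRR. As a reference point, here is a minimal sketch of how the two answer-selection metrics can be computed from ranked binary relevance labels (plain Python; the data at the end is an invented toy example, not TREC QA):

```python
def average_precision(labels):
    """AP for one question: `labels` are the binary relevance labels
    of its candidate answers, in ranked order."""
    hits, precisions = 0, []
    for rank, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant hit
    return sum(precisions) / hits if hits else 0.0

def reciprocal_rank(labels):
    """RR for one question: 1 / rank of the first relevant answer."""
    for rank, rel in enumerate(labels, start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def map_mrr(ranked_labels_per_question):
    """Mean over all questions -> (MAP, MRR)."""
    n = len(ranked_labels_per_question)
    ap = sum(average_precision(l) for l in ranked_labels_per_question) / n
    rr = sum(reciprocal_rank(l) for l in ranked_labels_per_question) / n
    return ap, rr

# Two toy questions: ranked candidates with binary relevance labels.
m, r = map_mrr([[0, 1, 1], [1, 0, 0]])
print(m, r)
```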
3.4 Seq2seq Application
Seq2seq covers tasks whose input and output are both sequences, such as machine translation. It mainly uses an encoder–decoder architecture [19, 100], typically combined with an attention mechanism [3], as shown in Fig. 13. Both the encoder and the decoder can be implemented as an RNN [19], a CNN [34], or with attention mechanisms alone (i.e., the Transformer [111]).
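The attention mechanism at the heart of these architectures can be sketched in a few lines. The following NumPy toy illustrates the scaled dot-product form used by the Transformer [111], with a single decoder query attending over encoder states; the shapes and random values are invented for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stabilized
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Q: (m, d), K: (n, d), V: (n, dv). Each output row is a weighted
    average of the rows of V, with weights from Q-K similarities."""
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))  # (m, n); each row sums to 1
    return weights @ V, weights

# One decoder step attending over a 3-token encoded source sequence.
rng = np.random.default_rng(0)
K = rng.normal(size=(3, 4))   # encoder states used as keys
V = rng.normal(size=(3, 4))   # encoder states used as values
q = rng.normal(size=(1, 4))   # current decoder state as the query
context, w = scaled_dot_product_attention(q, K, V)
print(context.shape)          # context vector fed to the decoder
```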
^7 https://aclweb.org/aclwiki/Question_Answering_(State_of_the_art)
Representing Words in Vector Space and Beyond
Table 1 Popular QA datasets

| Dataset | Characteristics | Main institution | Venue |
|---|---|---|---|
| TREC QA [119]^a | Open-domain question answering | CMU | EMNLP 2007 |
| Insurance QA [32] | Question answering for insurance | IBM Watson | ASRU 2015 |
| Wiki QA [123] | Open-domain question answering | MS | EMNLP 2015 |
| Narrative QA [53] | Reading comprehension | DeepMind | TACL 2018 |
| SQuAD 1.0 [85] | Questions for machine comprehension | Stanford | EMNLP 2016 |
| MS MARCO [76] | Human-generated machine reading | MS | NIPS 2016 |
| NewsQA [107, 108] | Reading comprehension | Maluuba | RepL4NLP 2017 |
| TriviaQA [48] | Reading comprehension with distantly supervised labels | Allen AI | ACL 2017 |
| SQA [47] | Sequential question answering | U. of Maryland & MS | ACL 2017 |
| CQA [102] | QA with a knowledge base of the web | Tel-Aviv University | NAACL 2018 |
| CSQA [92] | Complex sequential QA | IBM | AAAI 2018 |
| QuAC [20]^b | Question answering in context | Allen AI | EMNLP 2018 |
| SQuAD 2.0 [84] | SQuAD with unanswerable questions | Stanford | ACL 2018 |
| CoQA [87]^c | Conversational question answering | Stanford | Aug. 2018 |
| Natural Questions [57] | Natural questions in Google search | | TACL 2019 |

The frequent release of QA datasets shows that the academic community is paying more and more attention to this task, and almost all researchers in this community tend to use word-embedding-based neural networks for it.

^a http://cs.stanford.edu/people/mengqiu/data/qg-emnlp07-data.tgz
^b http://quac.ai/
^c https://stanfordnlp.github.io/coqa/
3.5 Evaluation
The basic evaluations of word embedding techniques are based on the applications above [94], e.g., word-level evaluation and downstream NLP tasks like those mentioned in the last section, as shown in [58]. For a downstream task in particular, there are two common ways to use word embeddings: as fixed features, or as initial weights that are then fine-tuned. We therefore divide the evaluations into two parts, i.e., context-free word properties and embedding-based downstream NLP tasks; only the latter involves context, and there the embeddings can be fine-tuned.
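The difference between the two usages can be made concrete with a small NumPy sketch (all names and values are invented for illustration): in both cases words are looked up in a pretrained embedding matrix, but only under fine-tuning does a gradient step modify the looked-up rows of the matrix itself.

```python
import numpy as np

rng = np.random.default_rng(1)
E = rng.normal(size=(5, 3))      # pretrained embedding matrix: 5 words, dim 3
E_frozen = E.copy()              # "fixed features" usage: never updated

token_ids = np.array([0, 2, 4])  # a toy 3-word sentence
grad = rng.normal(size=(3, 3))   # stand-in gradient w.r.t. the used embeddings
lr = 0.1

# Fine-tuning usage: only the rows that were looked up receive an update.
E[token_ids] -= lr * grad

changed = ~np.isclose(E, E_frozen).all(axis=1)
print(changed)  # rows 0, 2, 4 moved; rows 1 and 3 are untouched
```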
Word Property. Examples of context-free word properties include word polarity classification, word similarity, word analogy, and recognition of synonyms and antonyms. In particular, one of the typical tasks is the analogy task [70], which mainly targets both syntactic and semantic analogies. For instance, "man is to woman" is semantically similar to "king is to queen," while "predict is to predicting" is syntactically similar to "dance is to dancing." Word embedding methods achieve good performance on these word-level tasks, which demonstrates that word embeddings can capture the basic semantic and syntactic properties of words.

Table 2 State-of-the-art methods for answer sentence selection, evaluated on the TREC QA dataset

| Algorithm | Reference | MAP | MRR |
|---|---|---|---|
| Mapping dependencies trees [82] | AI and Math Symposium 2004 | 0.419 | 0.494 |
| Dependency relation [22] | SIGIR 2005 | 0.427 | 0.526 |
| Quasi-synchronous grammar [119] | EMNLP 2007 | 0.603 | 0.685 |
| Tree edit models [42] | NAACL 2010 | 0.609 | 0.692 |
| Probabilistic tree edit models [118] | COLING 2010 | 0.595 | 0.695 |
| Tree edit distance [124] | NAACL 2013 | 0.631 | 0.748 |
| Question classifier, NER, and tree kernels [95] | EMNLP 2013 | 0.678 | 0.736 |
| Enhanced lexical semantic models [126] | ACL 2013 | 0.709 | 0.770 |
| DL with bigram + count [128] | NIPS 2014 DL workshop | 0.711 | 0.785 |
| LSTM (three-layer BLSTM + BM25) [116] | ACL 2015 | 0.713 | 0.791 |
| Architecture-II [32, 45] | NIPS 2014 | 0.711 | 0.800 |
| L2R + CNN + overlap [96] | SIGIR 2015 | 0.746 | 0.808 |
| aNMM: attention-based neural matching model [122] | CIKM 2016 | 0.750 | 0.811 |
| Holographic dual LSTM architecture [104] | SIGIR 2017 | 0.750 | 0.815 |
| Pairwise word interaction modeling [40] | NAACL 2016 | 0.758 | 0.822 |
| Multi-perspective CNN [39] | EMNLP 2015 | 0.762 | 0.830 |
| HyperQA (hyperbolic embeddings) [103] | WSDM 2018 | 0.770 | 0.825 |
| PairwiseRank + multi-perspective CNN [86] | CIKM 2016 | 0.780 | 0.834 |
| BiMPM [120] | IJCAI 2017 | 0.802 | 0.875 |
| Compare-aggregate [8] | CIKM 2017 | 0.821 | 0.899 |
| IWAN [97] | EMNLP 2017 | 0.822 | 0.889 |
| IWAN + sCARNN [106] | NAACL 2018 | 0.829 | 0.875 |
| NNQLM [131] | AAAI 2018 | 0.759 | 0.825 |
| Multi-cast attention networks (MCAN) [105] | KDD 2018 | 0.838 | 0.904 |

Recent papers on TREC QA use embedding-based neural network approaches, whereas earlier ones relied on traditional methods such as IR approaches and edit distance.
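The word-analogy evaluation is usually carried out with vector arithmetic: the answer to "a is to b as c is to ?" is the word whose vector is closest to v(b) − v(a) + v(c), excluding the query words. A toy sketch with hand-made 2-d embeddings (the vectors are invented for illustration, not taken from a trained model):

```python
import numpy as np

# Toy embeddings: one "gender" axis and one "royalty" axis (invented values).
emb = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([-1.0, 0.0]),
    "king":  np.array([1.0, 1.0]),
    "queen": np.array([-1.0, 1.0]),
    "apple": np.array([0.1, -1.0]),
}

def analogy(a, b, c, emb):
    """Solve 'a is to b as c is to ?' by the nearest cosine neighbour
    of emb[b] - emb[a] + emb[c], excluding the three query words."""
    target = emb[b] - emb[a] + emb[c]
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(analogy("man", "woman", "king", emb))  # -> queen
```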
Downstream Task. If word embeddings are used in context, meaning that each word is considered within a phrase or sentence for a specific target, we can train the embeddings with the labels of that specific task, e.g., sequential labeling, text classification, text matching, and machine translation. These tasks are grouped by the pattern of their input and output, as shown in Table 3.