Bibliography

Abbott, 1952: Abbott, E. (1952).
Flatland: A Romance of Many Dimensions.
Dover Publications, New York.
Albus, 1971: Albus, J. (1971).
A theory of cerebellar function.
Mathematical Biosciences, 10:25-61.
Albus, 1981: Albus, J. (1981).
Brain, Behavior, and Robotics.
Byte Books.
Ameisen, 1999: Ameisen, J. (1999).
La sculpture du vivant. Le suicide cellulaire ou la mort créatrice.
Seuil.
Anderson, 1989: Anderson, C. (1989).
Learning to control an inverted pendulum using neural networks.
IEEE Control Systems Magazine, 9(3):31-37.
Bain, 1873: Bain, A. (1873).
Mind and Body. The Theories of Their Relation.
Henry King, London.
Baird, 1995: Baird, L. C. (1995).
Residual algorithms: Reinforcement learning with function approximation.
Proceedings of the Twelfth International Conference on Machine Learning, pages 30-37.
Barto et al., 1983: Barto, A., Sutton, R., and Anderson, C. (1983).
Neurolike adaptive elements that can solve difficult learning control problems.
IEEE Transactions on Systems, Man, and Cybernetics, SMC13:834-846.
Bellman, 1957: Bellman, R. (1957).
Dynamic Programming.
Princeton University Press, Princeton, NJ.
Bertsekas and Tsitsiklis, 1996: Bertsekas, D. and Tsitsiklis, J. (1996).
Neuro-Dynamic Programming.
Athena Scientific.
Bersini and Gorrini, 1996: Bersini, H. and Gorrini, V. (1996).
Three connectionist implementations of dynamic programming for optimal control: A preliminary comparative analysis.
In Workshop on Neural Networks for Identification and Control in Robotics.
Berthoz, 1997: Berthoz, A. (1997).
Le sens du mouvement.
Odile Jacob.
Bertsekas, 1987: Bertsekas, D. (1987).
Dynamic Programming.
Prentice Hall.
Bouchon, 1988: Bouchon, B. (1988).
Entropic models: a general framework for measures of uncertainty and information.
Logic in Knowledge-Based Systems, Decision and Control, pages 93-105.
Bouchon-Meunier, 1989: Bouchon-Meunier, B. (1989).
Incertitude, information, imprécision: une réflexion sur l'évolution de la théorie de l'information.
Revue Internationale de Systémique, 3(4):375-385.
Brooks, 1986: Brooks, R. A. (1986).
A robust layered control system for a mobile robot.
IEEE Journal of Robotics and Automation, RA-2(1):14-23.
Carnap, 1950: Carnap, R. (1950).
Logical Foundations of Probability.
University of Chicago Press.
Changeux, 1983: Changeux, J. (1983).
L'homme neuronal.
Fayard.
Damasio, 1994: Damasio, A. (1994).
Descartes' Error: Emotion, Reason and the Human Brain.
Picador.
Damasio, 1999: Damasio, A. (1999).
Le sentiment même de soi - Corps, émotions, conscience.
Editions Odile Jacob Sciences.
Daubechies, 1992: Daubechies, I. (1992).
Ten Lectures on Wavelets, volume 61 of CBMS-NSF Regional Conference Series in Applied Mathematics.
SIAM, Philadelphia.
Davesne and Barret, 1999a: Davesne, F. and Barret, C. (1999a).
Constraint based memory units for reactive navigation learning.
In European Workshop on Learning Robots.
Davesne and Barret, 1999b: Davesne, F. and Barret, C. (1999b).
Reactive navigation of a mobile robot using a hierarchical set of learning agents.
In IROS'99.
Dayan, 1992: Dayan, P. (1992).
The convergence of TD($\lambda$) for general $\lambda$.
Machine Learning, 8:341-362.
Dayan and Sejnowski, 1994: Dayan, P. and Sejnowski, T. (1994).
TD($\lambda$) converges with probability 1.
Machine Learning, 14:295-301.
Denton, 1985: Denton, M. (1985).
Evolution: Une théorie en crise.
Flammarion.
d'Espagnat, 1985: d'Espagnat, B. (1985).
Une incertaine réalité.
Gauthier-Villars.
d'Espagnat, 1994: d'Espagnat, B. (1994).
Le réel voilé.
Fayard.
Edelman, 1992: Edelman, G. (1992).
Bright Air, Brilliant Fire: On the Matter of Mind.
Basic Books, New York.
Gaussier et al., 1997: Gaussier, P., Joulain, C., Zrehen, S., Banquet, J., and Revel, A. (1997).
Visual navigation in an open environment without map.
To appear in IROS Conference, Grenoble.
Gaussier and Zrehen, 1995: Gaussier, P. and Zrehen, S. (1995).
Perac: A neural architecture to control artificial animals.
Robotics and Autonomous Systems, 16(2-4):291-320.
Glorennec, 1994: Glorennec, P. (1994).
Fuzzy Q-learning and dynamical fuzzy Q-learning.
In FUZZ-IEEE'94, Orlando.
Harnad, 1992: Harnad, S. (1992).
Cognition and the symbol grounding problem.
Electronic symposium on computation.
Hartley, 1928: Hartley, R. (1928).
Transmission of information.
Bell System Technical Journal, 7:535-563.
Hebb, 1949: Hebb, D. (1949).
The Organization of Behavior.
John Wiley & Sons, New York.
Hilbert, 1928: Hilbert, D. and Ackermann, W. (1928).
Grundzüge der theoretischen Logik.
Springer, Berlin.
Hsu et al., 1990: Hsu, F., Anantharaman, T., Campbell, M., and Nowatzyk, A. (1990).
A grandmaster chess machine.
Scientific American, 263(4):11-50.
Jacobs, 1993: Jacobs, O. (1993).
Introduction to Control Theory.
Oxford University Press.
James, 1890: James, W. (1890).
Principles of Psychology.
Henry Holt, New York.
Jaulin and Walter, 1993: Jaulin, L. and Walter, E. (1993).
Set inversion via interval analysis for non-linear bounded estimation.
Automatica, 29(4):1053-1064.
Jaulin et al., 1996: Jaulin, L., Walter, E., and Didrit, O. (1996).
Guaranteed robust nonlinear parameter bounding.
In CESA'96 IMACS Multiconference Symposium on Modelling, Analysis and Simulation, volume 2, pages 1156-1163, Lille.
Jouffe, 1997: Jouffe, L. (1997).
Apprentissage de systèmes d'inférence floue par des méthodes de renforcement: application à la régulation d'ambiance dans un bâtiment d'élevage porcin.
Doctoral thesis, Université de Rennes 1.
Jutten and Herault, 1991: Jutten, C. and Herault, J. (1991).
Blind separation of sources, Part I: An adaptive algorithm based on neuromimetic architecture.
Signal Processing, 24:1-10.
Kaelbling et al., 1996: Kaelbling, L., Littman, M., and Moore, A. (1996).
Reinforcement learning: A survey.
Journal of Artificial Intelligence Research, 4:237-285.
Kahneman and Tversky, 1979: Kahneman, D. and Tversky, A. (1979).
Prospect theory: An analysis of decision under risk.
Econometrica, 47:263-291.
Kalman, 1960: Kalman, R. (1960).
A new approach to linear filtering and prediction problems.
Transactions of the ASME - Journal of Basic Engineering, pages 35-45.
Kohonen, 2001: Kohonen, T. (2001).
Self-Organizing Maps, volume 30.
Springer Series in Information Sciences, Berlin.
Lecerf, 1997: Lecerf, C. (1997).
Une leçon de piano ou la double boucle de l'apprentissage cognitif, volume 3.
Travaux et Documents, Université Paris 8 Vincennes-Saint-Denis.
Lin and Kim, 1991: Lin, C. and Kim, H. (1991).
CMAC-based adaptive critic self-learning control.
IEEE Transactions on Neural Networks, 2:530-533.
Lin, 1992: Lin, L.-J. (1992).
Self-improving reactive agents based on reinforcement learning, planning and teaching.
Machine Learning, 8:293-321.
Maaref et al., 1999: Maaref, H., Barret, C., and Amamou, A. (1999).
Optimization of a fuzzy controller for a reactive navigation.
Computational Intelligence and Applications, pages 193-197.
McCulloch and Pitts, 1943: McCulloch, W. and Pitts, W. (1943).
A logical calculus of the ideas immanent in nervous activity.
Bulletin of Mathematical Biophysics, 5:115-137.
McGovern et al., 1998: McGovern, A., Precup, D., Ravindran, B., Singh, S., and Sutton, R. (1998).
Hierarchical optimal control of MDPs.
Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, pages 186-191.
Meyer and Wilson, 1991: Meyer, J. and Wilson, S., editors (1991).
Proceedings of the First International Conference on Simulation of Adaptive Behavior: From Animals to Animats.
Michel, 1996: Michel, O. (1996).
Khepera simulator package version 2.0: Freeware mobile robot simulator.
http://wwwi3s.unice.fr/~om/khep-sim.html.
Mondada et al., 1994: Mondada, F., Franzi, E., and Ienne, P. (1994).
Mobile robot miniaturization: A tool for investigation in control algorithms.
In Yoshikawa, T. and Miyazaki, F., editors, Proceedings of the Third International Symposium on Experimental Robotics 1993, pages 501-513. Springer Verlag.
Moore, 1979: Moore, R. (1979).
Methods and Applications of Interval Analysis.
SIAM, Philadelphia.
Munos, 1997: Munos, R. (1997).
Apprentissage par Renforcement, Étude du cas Continu.
Doctoral thesis, EHESS, CEMAGREF.
Nowé, 1995: Nowé, A. (1995).
Fuzzy reinforcement learning: an overview.
Advances in fuzzy theory and technology.
O'Regan and Noë, 2001: O'Regan, J. and Noë, A. (2001).
A sensorimotor account of vision and visual consciousness.
Behavioral and Brain Sciences, 24(5).
O'Regan et al., 1999: O'Regan, J., Rensink, R., and Clark, J. (1999).
Blindness to scene changes caused by mudsplashes.
Nature, 398.
Pavlov, 1927: Pavlov, I. (1927).
Conditioned Reflexes.
Oxford University Press.
Pendrith, 1994: Pendrith, M. (1994).
On reinforcement learning of control actions in noisy and non-Markovian domains.
Technical report, UNSW Computer Science and Engineering.
Pendrith, 1999: Pendrith, M. (1999).
Reinforcement learning in situated agents: Some theoretical problems and practical solutions.
In 8th European Workshop on Learning Robots.
Pendrith and McGarity, 1998: Pendrith, M. and McGarity, M. (1998).
An analysis of direct reinforcement learning in non-Markovian domains.
The Fifteenth International Conference on Machine Learning.
Peng and Williams, 1996: Peng, J. and Williams, R. (1996).
Incremental multi-step Q-learning.
Machine Learning, 22:283-290.
Pham et al., 1992: Pham, D., Garrat, P., and Jutten, C. (1992).
Separation of a mixture of independent sources through a maximum likelihood approach.
In Proc. EUSIPCO, pages 771-774.
Pitrat, 1990: Pitrat, J. (1990).
Métaconnaissance - Futur de l'intelligence artificielle.
Hermès.
Pradel and Barret, 1998: Pradel, G. and Barret, C. (1998).
Environment recognition in mobile robotics by means of neural networks.
Journal Européen des Systèmes Automatisés, 32:939-963.
Quinlan, 1984: Quinlan, J. (1984).
Learning efficient classification procedures and their application to chess endgames.
Machine Learning: An Artificial Intelligence Approach, pages 463-482.
Rasmussen, 1986: Rasmussen, J. (1986).
Information Processing and Human-Machine Interaction; An approach to cognitive engineering.
Elsevier Science Publishing Co, North-Holland.
Reeke et al., 1990: Reeke, G., Sporns, O., and Edelman, G. (1990).
Synthetic neural modeling: The "Darwin" series of recognition automata.
Proc. of the IEEE, 78(9):1498-1530.
Reich, 1949: Reich, W. (1949).
L'analyse caractérielle.
Payot.
Revel, 1997: Revel, A. (1997).
Contrôle d'un robot mobile autonome par approche neuromimétique.
Doctoral thesis, Université de Cergy-Pontoise.
Rich, 1983: Rich, E. (1983).
Artificial Intelligence.
McGraw-Hill.
Rosenblatt, 1958: Rosenblatt, F. (1958).
The perceptron: A probabilistic model for information storage and organization in the brain.
Psychological Review, 65:386-408.
Rumelhart et al., 1986: Rumelhart, D., Hinton, G., and Williams, R. (1986).
Learning internal representations by error propagation.
Nature, 323:533-536.
Rummery, 1995: Rummery, G. (1995).
Problem Solving with Reinforcement Learning.
PhD thesis, Cambridge University.
Samuel, 1959: Samuel, A. (1959).
Some studies in machine learning using the game of checkers.
IBM Journal of Research and Development, 3:211-229.
Sauvage, 1999: Sauvage, G. (1999).
Les marchés financiers. Entre hasard et raison: le facteur humain.
Seuil.
Schwartz, 1993: Schwartz, A. (1993).
A reinforcement learning method for maximising undiscounted rewards.
Proceedings of the Int. Conf. on Machine Learning.
Shannon, 1948: Shannon, C. (1948).
A mathematical theory of communication.
Bell System Technical Journal, 27:379-423, 623-656.
Shortliffe and Buchanan, 1975: Shortliffe, E. and Buchanan, B. (1975).
A model of inexact reasoning in medicine.
Mathematical Biosciences, 23:351-379.
Sutton, 1984: Sutton, R. (1984).
Temporal Credit Assignment in Reinforcement Learning.
PhD thesis, University of Massachusetts, Amherst, MA.
Sutton, 1988: Sutton, R. (1988).
Learning to predict by the method of temporal differences.
Machine Learning, 3:9-44.
Sutton, 1996: Sutton, R. (1996).
Generalization in reinforcement learning: Successful examples using sparse coarse coding.
Advances in Neural Information Processing Systems: Proceedings of the 1995 Conference, pages 1038-1044.
Sutton and Barto, 1998: Sutton, R. and Barto, A. (1998).
Reinforcement Learning: An Introduction.
MIT Press, Cambridge, MA.
Tanaka, 1995: Tanaka, K. (1995).
Stability and stabilizability of fuzzy-neural control systems.
IEEE Transactions on Fuzzy Systems, 3(4):438-447.
Tesauro, 1994: Tesauro, G. (1994).
TD-Gammon, a self-teaching backgammon program, achieves master-level play.
Neural Computation, 6:215-219.
Thagard and Barnes, 1996: Thagard, P. and Barnes, A. (1996).
Emotional decisions.
Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society, pages 426-429.
Thagard and Millgram, 1997: Thagard, P. and Millgram, E. (1997).
Inference to the best plan: A coherence theory of decision.
Goal-Driven Learning, pages 439-454.
Touzet et al., 1995: Touzet, C., Sehad, S., and Giambiasi, N. (1995).
Improving reinforcement learning of obstacle avoidance behavior with forbidden sequences of actions.
In International Conference on Robotics and Manufacturing, Cancun, Mexico, 14-16 June 1995.
Tsitsiklis and Van Roy, 1996: Tsitsiklis, J. and Van Roy, B. (1996).
Feature-based methods for large scale dynamic programming.
Machine Learning, 22:59-94.
Tsitsiklis, 1994: Tsitsiklis, J. (1994).
Asynchronous stochastic approximation and Q-learning.
Machine Learning, 16:185-202.
Turing, 1936: Turing, A. (1936).
On computable numbers, with an application to the Entscheidungsproblem.
Proceedings of the London Mathematical Society, 42(2):230-265.
Turing, 1950: Turing, A. (1950).
Computing machinery and intelligence.
Mind, 59:433-460.
Tversky and Kahneman, 1981: Tversky, A. and Kahneman, D. (1981).
The framing of decisions and the psychology of choice.
Science, 211:453-458.
Wang et al., 1996: Wang, H., Tanaka, H., and Griffin, M. (1996).
An approach to fuzzy control of nonlinear systems: Stability and design issues.
IEEE Transactions on Fuzzy Systems, 4(1):14-23.
Watkins, 1989: Watkins, C. (1989).
Learning from Delayed Rewards.
PhD thesis, King's College, Cambridge, UK.
Weizenbaum, 1976: Weizenbaum, J. (1976).
Computer Power and Human Reason.
W.H. Freeman.
Whitehead, 1992: Whitehead, S. (1992).
Reinforcement Learning for the Adaptive Control of Perception and Action.
PhD thesis, King's College, Cambridge, England.
Wiering, 1999: Wiering, M. (1999).
Explorations in Efficient Reinforcement Learning.
PhD thesis, Universiteit van Amsterdam.
Wiering and Schmidhuber, 1997: Wiering, M. and Schmidhuber, J. (1997).
HQ-learning.
Adaptive Behavior, 6(2):219-246.

2002-03-01