
Re: Warsaw Univ. course, was Re: Work on Basic
Posted:
Jun 3, 2013 11:13 PM


> I have no objection to a data type that might reasonably be called > Vector, or perhaps Sequence, that implements a data structure that uses > sequential storage. I object to calling it List. >
I'd probably agree with this. But it is too late by now, in any case.
> > > > Try using your implementation to implement some 1000 x 1000 matrix > algebra! > > I would not particularly advise using linked lists for a dense matrix, > but linked lists for small matrices might be just fine, or sparse > matrices of large dimension but few entries. In the context of SYMBOLIC > matrices, one generally ends up using smallish sizes because of > explosive growth in costs for doing nontrivial operations with them. > In a numeric context, 1000x1000 or much much larger is more likely. > > > > > OK  perhaps you wouldn't want to use your new list implementation for > > linear algebra, but in that case, where would you want to use it? > > You think that linked lists are of no use? If you have access to > a computer science text on data structures, I suggest you look. Or > use Google for ... applications linked lists. > > > Lisp > > lists are expensive to use unless the algorithm is tailored to generate > > lists in reverse order, and consume them in forward order (OK you can > > reverse a Lisp list, but that consumes space for a copy, but random > > access is still inefficient). > > This is a silly argument. If you know in advance that the size of a > vector is n then you can allocate space for it in Mathematica or in Lisp > as a vector. > If you don't know the size, > adding one more element beyond the allocated space n requires O(n) > time for copying. For a linked list it is O(1) to add an element. > This has consequences for many applications including building queues, > algebraic expression trees, storage management etc. Perhaps you are > familiar with the related notion of a pointer, in the C language. > > It is quite possible to write a lisp program that appends to the end of > a list in time O(1). If you have access to information on lisp, look up > nconc. It is, however a "destructive" operation, and not my favorite > way of building a list. If I want to reverse a list that I know is not > shared (i.e. I just constructed it!) I can use nreverse which is > quite fast, and reverses a list IN PLACE. Accessing elements in a linked > list of just a few elements (say, 8 or 10?) may be faster than accessing > elements in a vector, where array indexes may have to be translated into > byte addresses, checked against array bounds, before retrieving. > Note that once a linked list is stored in memory cache, accesses can be > quite fast. > > If a lisp programmer wants to use an array, that is available in the > language. If a lisp programmer wants to use a linked list, that is > available too. If a Mathematica programmer wants to use a linked list, > there is the (admittedly clumsy) construction I suggested previously, > but basically it not an advertised option. >
Well, I'd say it can be used quite successfully. Here is a recent post I've made where I tried to summarize my and others' experiences with these constructs:
http://mathematica.stackexchange.com/questions/24988/canoneidentifythedesignpatternsofmathematica/25474#25474
In particular, there I mentioned two less trivial problems where linked lists were particularly beneficial in the Mathematica context:
http://mathematica.stackexchange.com/questions/5387/howcaniusemathematicasgraphfunctionstocheatatboggle/5394#5394
and
http://mathematica.stackexchange.com/questions/6704/performancetuningforgamesolvingpegsolitairesenku/6891#6891
The real problem is that Mathematica is a laguage of multiple performance scales. There is a scale associated with the toplevel code (pretty slow), there is a scale associated with the patternmatching (can be order of magnitude faster if the patterns are constructed right), and there is a scale associated with packed arrays and vectorized operations, which is yet an order or even two orders of magnitude faster.
At the same time, Mathematica does not possess general performant data structures (in the sense in which C or Java or e.g. SML, or Lisp, have them). So, one is often forced to shoehorn the problem's code into a set of efficient data structures it has (packed arrays, sparse arrays). While there are a number of techniques available to achieve this, and I tried to summarize those known to me here
http://stackoverflow.com/questions/4721171/performancetuninginmathematica/4723969#4723969
I would agree that for many problems this may lead to rather unnatural and twisted code. The biggest problem with this is that one has to have near expertlevel skills to be able to work effectively with this for *any* problem.
OTOH, for many data processing problems, Mathematica's approach of "work with as much data at once as possible" also has an advantage that it allows a very highlevel thinking and very concise *and* reasonably fast code  a feature probably borrowed conceptually from APL. So, many intermediate users who have mastered vectorized subset of Mathematica can be extremely effective with their work without actually becoming professional programmers. In other words, the vectorized susbset of Mathematica combined with its very highlevel nature and great visualization capabilities constitutes IMO a very effective and rather easy to use DSL.
So, this looks to be a doubleedged sword. I think that the actual problem is to find a right balance,given that the vast majority of Mathematica users are not professional programmers and do not intend to become ones. Your criticism has a lot of valid points from a CS viewpoint, but I don't think that this viewpoint is the most important one for the majority of users. And this seems to be what others were telling here, in different words.
Returning to linked lists, they can still give some of the best toplevel performance possible in Mathematica without using packed arrays, sparse arrays, Compile and other techniques which would constrain the types of data for which they can be applied. In some cases, Mathematica is flexible enough to make up for this gap (to some extent). Here is a link to one such example, where I implemented a version of merge function for the merge sort algorithm, which uses linked lists for general case and automatically creates JITcompiled memoized versions for numerical data:
http://mathematica.stackexchange.com/questions/6931/implementingafunctionwhichgeneralizesthemergingstepinmergesort/6933#6933
So, let me put it this way: Mathematica seems to be a good compromise between being practical and having flawless language design (considered from a pure CS viewpoint), and as a language designed to be used by not professional programmers, does quite well in many practical cases. If we consider its design from the practical viewpoint of the majority of its current heavy users, its design is probably close to optimal however.
> > If you want to look for ways in which lisp can be expensive, I > suggest you learn more about it, for example understanding boxed/unboxed > numbers. > > The idea that someone might write a slow program in a language > is not an indictment of the language if a slightly more experienced or > skilled or knowledgeable person could write an equivalent computation > that runs very fast. It would be nice if even naive programs written > by naive programmers were magically "optimized" into better programs. > This is a challenge in the design and implementation of languages. >
Yes, I agree with this, but for Mathematica, the other path was chosen  to make it practical first. That means make a few reasonably welldesigned DSLs for practical things, and make sure they can be glued together without many semantic inconsistencies. That's what Mathematica currently does.
And I don't exclude that in the future, things will get more optimized (e.g. Compile will be able to handle a wider subset of Mathematica), and some useful things may be added to the language. The model of computation of Mathematica is very general, so it may be able to accommodate new additions to the language.
> E.g. "automatic parallelization" ... > > My criticisms of Mathematica's language (EF) are > based on situations in which the language itself interferes with the > writing of efficient or mathematically consistent or even numerically > accurate programs. And it misuses names... >
You are clearly speaking from the CS viewpoint here. It may be not the only valid point of view.
> > > > > Mathematica also gains from the decision to make lists mutable  again > > this simplifies the design of algorithms and increases performance. > > I have no idea what you are arguing about. Lists in lisp are mutable. > e.g. (setf x '(a b c)) (setf (second x) 'q) > > x is the list (a q c) > > "Lists" in Mathematica may not be. e.g. > f[n_]:=(Print[n];Hold[f[n]]); > q=Table[f[i],{i,1,3}] > q[2]= 2 > FAILS. > ReleaseHold[q[2]] reevaluates not just q[2], but every element in q. > > It prints 1,2,3 >
You seem to use the wrong notation here. You have to use [[ ]] (double brackets), not [ ] (single brackets). Otherwise instead of extracting a part of a list, say, you are trying to index into an indexed variable (hashtable).
Lists *are* mutable in Mathematica, when stored in a symbol. So, if you assign
a = Range[10]
you can then modify some part inplace
a[[3]] = 100
but if you use
b[1] = Range[10]
then you can not use
b[1][[3]] = 100
Part assignments work not only for flat lists, but for any expressions stored in a symbol.
> > > > Clearly Mathematica must store some > > information in the head of every object to keep track of objects that > > need reevaluation, and you wouldn't want each list node to be bloated > > out in that way. > > It actually works the other way. If some element of a "List" is > altered, then every element in that array must be reevaluated. So it > is a big disadvantage. See the example above, with q[2]. >
Again, this is not true. You weren't careful with the syntax and did not test your example.
> > The point I was making regarding "Lists" is that Wolfram is using a > word for something quite different from the list data structure. > This is one among a number of reasons to not use Mathematica for > computer science. >
I would probably agree for the first course in pure CS, for which Scheme will arguably be a better choice. But once the students get past the basics, I agree with others that Mathematica can still teach a lot, if only for its model of computation that is different, and its level of interactivity and availability of many algorithms. And it can be an invaluable tool for analysis and development of algorithms, for the same reasons.
> > Linked lists are used all the time in Mathematica. g[h[x]] is a linked > list. >
It will only be a linked list if you interpret it so, and if g and h are inert heads. In general, this is just a symbolic expression, and you will have to provide a context to it, to assign any semantics to it.
All in all, I don't say that I totally disagree with you  I actually agree with many of your points if I look from the pure CS viewpoint. But, as I already said, this may not be the only one, or even the most relevant one, in the context of currently dominant uses of Mathematica.
And this is not the first time I express this opinion in my reply to you :)
Regards, Leonid
> > > By the way f[x] or Sin[x] looks sort of like a "function" but with > square brackets. But it is not really. It has something to do with > rules, patterns, evaluation, etc. As I recall, in SMP, Wolfram's prior > CAS before Mathematica, such things were called Projections. > This was probably too confusing, In any case the term was dropped. > Maybe the people who knew other meanings for projection were able to > get to him (e.g. graphics people?) in a way that computer scientists > could not. > > > http://www.stephenwolfram.com/publications/articles/computing/81smp/1/text.html > > > RJF >

